

SCHAUM’S OUTLINE OF 


THEORY AND PROBLEMS 


OF 


LINEAR 
ALGEBRA 


BY 


SEYMOUR LIPSCHUTZ, Ph.D. 


Associate Professor of Mathematics 


Temple University 


SCHAUM’S OUTLINE SERIES
McGRAW-HILL BOOK COMPANY
New York, St. Louis, San Francisco, Toronto, Sydney


Copyright © 1968 by McGraw-Hill, Inc. All Rights Reserved. Printed in the
United States of America. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without the
prior written permission of the publisher.


Preface 


Linear algebra has in recent years become an essential part of the mathematical 
background required of mathematicians, engineers, physicists and other scientists. 
This requirement reflects the importance and wide applications of the subject matter. 


This book is designed for use as a textbook for a formal course in linear algebra 
or as a supplement to all current standard texts. It aims to present an introduction to 
linear algebra which will be found helpful to all readers regardless of their fields of 
specialization. More material has been included than can be covered in most first 
courses. This has been done to make the book more flexible, to provide a useful book 
of reference, and to stimulate further interest in the subject. 


Each chapter begins with clear statements of pertinent definitions, principles and 
theorems together with illustrative and other descriptive material. This is followed 
by graded sets of solved and supplementary problems. The solved problems serve to 
illustrate and amplify the theory, bring into sharp focus those fine points without 
which the student continually feels himself on unsafe ground, and provide the repetition 
of basic principles so vital to effective learning. Numerous proofs of theorems are 
included among the solved problems. The supplementary problems serve as a complete 
review of the material of each chapter. 


The first three chapters treat of vectors in Euclidean space, linear equations and 
matrices. These provide the motivation and basic computational tools for the abstract 
treatment of vector spaces and linear mappings which follow. A chapter on eigen- 
values and eigenvectors, preceded by determinants, gives conditions for representing 
a linear operator by a diagonal matrix. This naturally leads to the study of various 
canonical forms, specifically the triangular, Jordan and rational canonical forms. 
In the last chapter, on inner product spaces, the spectral theorem for symmetric op- 
erators is obtained and is applied to the diagonalization of real quadratic forms. For 
completeness, the appendices include sections on sets and relations, algebraic structures
and polynomials over a field.


I wish to thank many friends and colleagues, especially Dr. Martin Silverstein and
Dr. Hwa Tsang, for invaluable suggestions and critical review of the manuscript.
I also want to express my gratitude to Daniel Schaum and Nicola Monti for their very
helpful cooperation.
SEYMOUR LIPSCHUTZ 


Temple University 
January, 1968 




CONTENTS


Chapter 1    VECTORS IN R^n AND C^n
Introduction. Vectors in R^n. Vector addition and scalar multiplication. Dot
product. Norm and distance in R^n. Complex numbers. Vectors in C^n.

Chapter 2    LINEAR EQUATIONS
Introduction. Linear equation. System of linear equations. Solution of a system
of linear equations. Solution of a homogeneous system of linear equations.

Chapter 3    MATRICES
Introduction. Matrices. Matrix addition and scalar multiplication. Matrix
multiplication. Transpose. Matrices and systems of linear equations. Echelon
matrices. Row equivalence and elementary row operations. Square matrices.
Algebra of square matrices. Invertible matrices. Block matrices.

Chapter 4    VECTOR SPACES AND SUBSPACES
Introduction. Examples of vector spaces. Subspaces. Linear combinations,
linear spans. Row space of a matrix. Sums and direct sums.

Chapter 5    BASIS AND DIMENSION
Introduction. Linear dependence. Basis and dimension. Dimension and subspaces.
Rank of a matrix. Applications to linear equations. Coordinates.

Chapter 6    LINEAR MAPPINGS
Mappings. Linear mappings. Kernel and image of a linear mapping. Singular
and nonsingular mappings. Linear mappings and systems of linear equations.
Operations with linear mappings. Algebra of linear operators. Invertible
operators.

Chapter 7    MATRICES AND LINEAR OPERATORS
Introduction. Matrix representation of a linear operator. Change of basis.
Similarity. Matrices and linear mappings.

Chapter 8    DETERMINANTS
Introduction. Permutations. Determinant. Properties of determinants. Minors
and cofactors. Classical adjoint. Applications to linear equations. Determinant
of a linear operator. Multilinearity and determinants.

Chapter 9    EIGENVALUES AND EIGENVECTORS
Introduction. Polynomials of matrices and linear operators. Eigenvalues and
eigenvectors. Diagonalization and eigenvectors. Characteristic polynomial,
Cayley-Hamilton theorem. Minimum polynomial. Characteristic and minimum
polynomials of linear operators.

Chapter 10   CANONICAL FORMS
Introduction. Triangular form. Invariance. Invariant direct-sum decompositions.
Primary decomposition. Nilpotent operators, Jordan canonical
form. Cyclic subspaces. Rational canonical form. Quotient spaces.

Chapter 11   LINEAR FUNCTIONALS AND THE DUAL SPACE
Introduction. Linear functionals and the dual space. Dual basis. Second dual
space. Annihilators. Transpose of a linear mapping.

Chapter 12   BILINEAR, QUADRATIC AND HERMITIAN FORMS
Bilinear forms. Bilinear forms and matrices. Alternating bilinear forms.
Symmetric bilinear forms, quadratic forms. Real symmetric bilinear forms.
Law of inertia. Hermitian forms.

Chapter 13   INNER PRODUCT SPACES
Introduction. Inner product spaces. Cauchy-Schwarz inequality. Orthogonality.
Orthonormal sets. Gram-Schmidt orthogonalization process. Linear
functionals and adjoint operators. Analogy between A(V) and C, special
operators. Orthogonal and unitary operators. Orthogonal and unitary matrices.
Change of orthonormal basis. Positive operators. Diagonalization and
canonical forms in Euclidean spaces. Diagonalization and canonical forms in
unitary spaces. Spectral theorem.

Appendix A   SETS AND RELATIONS
Sets, elements. Set operations. Product sets. Relations. Equivalence
relations.

Appendix B   ALGEBRAIC STRUCTURES
Introduction. Groups. Rings, integral domains and fields. Modules.

Appendix C   POLYNOMIALS OVER A FIELD
Introduction. Ring of polynomials. Notation. Divisibility. Factorization.



Chapter 1 


Vectors in R^n and C^n


INTRODUCTION 


In various physical applications there appear certain quantities, such as temperature 
and speed, which possess only “magnitude”. These can be represented by real numbers and 
are called scalars. On the other hand, there are also quantities, such as force and velocity, 
which possess both “magnitude” and “direction”. These quantities can be represented by 
arrows (having appropriate lengths and directions and emanating from some given ref- 
erence point O) and are called vectors. In this chapter we study the properties of such 
vectors in some detail. 


We begin by considering the following operations on vectors. 


(i) Addition: The resultant u + v of two vectors u and v is obtained by the so-called
parallelogram law, i.e. u + v is the diagonal of the parallelogram formed by u and v
as shown on the right.


(ii) Scalar multiplication: The product ku of a real number k by a vector u is obtained
by multiplying the magnitude of u by k and retaining the same direction if $k \ge 0$
or the opposite direction if $k < 0$, as shown on the right.


Now we assume the reader is familiar with the representation of the points in the plane 
by ordered pairs of real numbers. If the origin of the axes is chosen at the reference point 
O above, then every vector is uniquely determined by the coordinates of its endpoint. The 
relationship between the above operations and endpoints follows. 


(i) Addition: If (a, b) and (c, d) are the endpoints of the vectors u and v, then (a+c, b+d)
will be the endpoint of u + v, as shown in Fig. (a) below.


[Fig. (a): the vectors u and v with endpoints (a, b) and (c, d), and u + v with endpoint (a+c, b+d).
Fig. (b): the vector u with endpoint (a, b), and ku with endpoint (ka, kb).]


(ii) Scalar multiplication: If (a, b) is the endpoint of the vector u, then (ka, kb) will be the
endpoint of the vector ku, as shown in Fig. (b) above.




Mathematically, we identify a vector with its endpoint; that is, we call the ordered pair 
(a, b) of real numbers a vector. In fact, we shall generalize this notion and call an n-tuple 
$(a_1, a_2, \ldots, a_n)$ of real numbers a vector. We shall again generalize and permit the
coordinates of the n-tuple to be complex numbers and not just real numbers. Furthermore, in
Chapter 4, we shall abstract properties of these n-tuples and formally define the mathe- 
matical system called a vector space. 


We assume the reader is familiar with the elementary properties of the real number 
field which we denote by R. 


VECTORS IN R^n


The set of all n-tuples of real numbers, denoted by $R^n$, is called n-space. A particular
n-tuple in $R^n$, say
$$u = (u_1, u_2, \ldots, u_n)$$
is called a point or vector; the real numbers $u_i$ are called the components (or: coordinates)
of the vector u. Moreover, when discussing the space $R^n$ we use the term scalar for the
elements of R, i.e. for the real numbers.


Example 1.1: Consider the following vectors:
$$(0, 1), \quad (1, -3), \quad (1, 2, \sqrt{3}, 4), \quad (-5, \tfrac{1}{2}, 0, \pi)$$


The first two vectors have two components and so are points in $R^2$; the last two
vectors have four components and so are points in $R^4$.


Two vectors u and v are equal, written u=v, if they have the same number of com- 
ponents, i.e. belong to the same space, and if corresponding components are equal. The 
vectors (1, 2,3) and (2,3,1) are not equal, since corresponding elements are not equal. 


Example 1.2: Suppose $(x-y, x+y, z-1) = (4, 2, 3)$. Then, by definition of equality of vectors,
$$x - y = 4, \qquad x + y = 2, \qquad z - 1 = 3$$
Solving the above system of equations gives $x = 3$, $y = -1$, and $z = 4$.


VECTOR ADDITION AND SCALAR MULTIPLICATION 


Let u and v be vectors in $R^n$:
$$u = (u_1, u_2, \ldots, u_n) \quad\text{and}\quad v = (v_1, v_2, \ldots, v_n)$$
The sum of u and v, written u + v, is the vector obtained by adding corresponding components:
$$u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$$


The product of a real number k by the vector u, written ku, is the vector obtained by
multiplying each component of u by k:
$$ku = (ku_1, ku_2, \ldots, ku_n)$$
Observe that u + v and ku are also vectors in $R^n$. We also define
$$-u = -1u \quad\text{and}\quad u - v = u + (-v)$$


The sum of vectors with different numbers of components is not defined. 
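
The two operations translate directly into code. The following is a minimal sketch, assuming vectors are stored as Python tuples; the function names add, scale and sub are our own choices for illustration, and raising an error for mismatched lengths is one possible way to model "not defined". The vectors are those of Example 1.3.

```python
def add(u, v):
    """Return u + v, the componentwise sum of two vectors in R^n."""
    if len(u) != len(v):
        # The text leaves this sum undefined; raising an error is our choice.
        raise ValueError("sum is not defined for vectors with different numbers of components")
    return tuple(a + b for a, b in zip(u, v))


def scale(k, u):
    """Return ku, obtained by multiplying each component of u by the scalar k."""
    return tuple(k * a for a in u)


def sub(u, v):
    """Return u - v, defined as u + (-1)v."""
    return add(u, scale(-1, v))


u = (1, -3, 2, 4)
v = (3, 5, -1, -2)
print(add(u, v))                        # (4, 2, 1, 2)
print(scale(5, u))                      # (5, -15, 10, 20)
print(sub(scale(2, u), scale(3, v)))    # (-7, -21, 7, 14)
```

Note that sub is not a new primitive: it is built from add and scale, exactly as u - v is defined in terms of u + (-v).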




Example 1.3: Let $u = (1, -3, 2, 4)$ and $v = (3, 5, -1, -2)$. Then
$$u + v = (1+3, -3+5, 2-1, 4-2) = (4, 2, 1, 2)$$
$$5u = (5 \cdot 1, 5 \cdot (-3), 5 \cdot 2, 5 \cdot 4) = (5, -15, 10, 20)$$
$$2u - 3v = (2, -6, 4, 8) + (-9, -15, 3, 6) = (-7, -21, 7, 14)$$
Example 1.4: The vector $(0, 0, \ldots, 0)$ in $R^n$, denoted by 0, is called the zero vector. It is similar
to the scalar 0 in that, for any vector $u = (u_1, u_2, \ldots, u_n)$,
$$u + 0 = (u_1 + 0, u_2 + 0, \ldots, u_n + 0) = (u_1, u_2, \ldots, u_n) = u$$


Basic properties of the vectors in R” under the operations of vector addition and scalar 
multiplication are described in the following theorem. 


Theorem 1.1: For any vectors $u, v, w \in R^n$ and any scalars $k, k' \in R$:
(i) $(u + v) + w = u + (v + w)$        (v) $k(u + v) = ku + kv$
(ii) $u + 0 = u$                       (vi) $(k + k')u = ku + k'u$
(iii) $u + (-u) = 0$                   (vii) $(kk')u = k(k'u)$
(iv) $u + v = v + u$                   (viii) $1u = u$

Remark: Suppose u and v are vectors in $R^n$ for which $u = kv$ for some nonzero scalar
$k \in R$. Then u is said to be in the same direction as v if $k > 0$, and in the
opposite direction if $k < 0$.


DOT PRODUCT 
Let u and v be vectors in $R^n$:
$$u = (u_1, u_2, \ldots, u_n) \quad\text{and}\quad v = (v_1, v_2, \ldots, v_n)$$
The dot or inner product of u and v, denoted by $u \cdot v$, is the scalar obtained by multiplying
corresponding components and adding the resulting products:
$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$$
The vectors u and v are said to be orthogonal (or: perpendicular) if their dot product is
zero: $u \cdot v = 0$.
Example 1.5: Let $u = (1, -2, 3, -4)$, $v = (6, 7, 1, -2)$ and $w = (5, -4, 5, 7)$. Then
$$u \cdot v = 1 \cdot 6 + (-2) \cdot 7 + 3 \cdot 1 + (-4) \cdot (-2) = 6 - 14 + 3 + 8 = 3$$
$$u \cdot w = 1 \cdot 5 + (-2) \cdot (-4) + 3 \cdot 5 + (-4) \cdot 7 = 5 + 8 + 15 - 28 = 0$$
Thus u and w are orthogonal.
Basic properties of the dot product in $R^n$ follow.
Theorem 1.2: For any vectors $u, v, w \in R^n$ and any scalar $k \in R$:
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$        (iii) $u \cdot v = v \cdot u$
(ii) $(ku) \cdot v = k(u \cdot v)$                   (iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$
Remark: The space R” with the above operations of vector addition, scalar multiplication 
and dot product is usually called Euclidean n-space. 
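
As a small illustration of the definition, the dot product and the orthogonality test can be sketched as follows. This assumes vectors stored as tuples; the names dot and is_orthogonal are hypothetical helpers, not notation from the text. The exact comparison with 0 is appropriate for integer components such as those of Example 1.5; with floating-point components a tolerance would be needed.

```python
def dot(u, v):
    """Return u . v: multiply corresponding components and add the products."""
    if len(u) != len(v):
        raise ValueError("dot product is not defined for vectors of different lengths")
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(u, v):
    """Return True when u . v = 0 (exact test, intended for integer components)."""
    return dot(u, v) == 0


u = (1, -2, 3, -4)
v = (6, 7, 1, -2)
w = (5, -4, 5, 7)
print(dot(u, v))            # 3
print(dot(u, w))            # 0
print(is_orthogonal(u, w))  # True
```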


NORM AND DISTANCE IN R^n
Let u and v be vectors in $R^n$: $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$. The
distance between the points u and v, written $d(u, v)$, is defined by
$$d(u, v) = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$$




The norm (or: length) of the vector u, written $\|u\|$, is defined to be the nonnegative square
root of $u \cdot u$:
$$\|u\| = \sqrt{u \cdot u} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$$
By Theorem 1.2, $u \cdot u \ge 0$ and so the square root exists. Observe that
$$d(u, v) = \|u - v\|$$


Example 1.6: Let $u = (1, -2, 4, 1)$ and $v = (3, 1, -5, 0)$. Then
$$d(u, v) = \sqrt{(1-3)^2 + (-2-1)^2 + (4+5)^2 + (1-0)^2} = \sqrt{95}$$
$$\|v\| = \sqrt{3^2 + 1^2 + (-5)^2 + 0^2} = \sqrt{35}$$
Now if we consider two points, say $p = (a, b)$ and $q = (c, d)$ in the plane $R^2$, then
$$\|p\| = \sqrt{a^2 + b^2} \quad\text{and}\quad d(p, q) = \sqrt{(a-c)^2 + (b-d)^2}$$
That is, $\|p\|$ corresponds to the usual Euclidean length of the arrow from the origin to the
point p, and $d(p, q)$ corresponds to the usual Euclidean distance between the points p and
q, as shown below:


[Figure: the points p = (a, b) and q = (c, d) in the plane.]


A similar result holds for points on the line R and in space $R^3$.


Remark: A vector e is called a unit vector if its norm is 1: $\|e\| = 1$. Observe that, for
any nonzero vector $u \in R^n$, the vector $e_u = u/\|u\|$ is a unit vector in the same
direction as u.


We now state a fundamental relationship known as the Cauchy-Schwarz inequality. 


Theorem 1.3 (Cauchy-Schwarz): For any vectors $u, v \in R^n$, $|u \cdot v| \le \|u\| \, \|v\|$.


Using the above inequality, we can now define the angle $\theta$ between any two nonzero
vectors $u, v \in R^n$ by
$$\cos\theta = \frac{u \cdot v}{\|u\| \, \|v\|}$$
Note that if $u \cdot v = 0$, then $\theta = 90^\circ$ (or: $\theta = \pi/2$). This then agrees with our previous
definition of orthogonality.
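
The norm, the distance and the angle can be computed along the same lines. The sketch below assumes tuples of real numbers and uses the hypothetical helper names norm, distance and angle. Clamping the cosine to the interval [-1, 1] is our own safeguard against floating-point rounding; mathematically the Cauchy-Schwarz inequality already guarantees that the quotient lies in that interval.

```python
import math


def dot(u, v):
    """Return the dot product of two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Return ||u||, the nonnegative square root of u . u."""
    return math.sqrt(dot(u, u))


def distance(u, v):
    """Return d(u, v) = ||u - v||."""
    return norm(tuple(a - b for a, b in zip(u, v)))


def angle(u, v):
    """Return the angle (in radians) between two nonzero vectors."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("the angle is defined only for nonzero vectors")
    cos_theta = dot(u, v) / (nu * nv)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.acos(cos_theta)


u = (1, -2, 4, 1)
v = (3, 1, -5, 0)
print(distance(u, v) ** 2)   # approximately 95
print(norm(v) ** 2)          # approximately 35
```

For the orthogonal vectors (1, -2, 3, -4) and (5, -4, 5, 7) of Example 1.5, angle returns a value equal to pi/2 up to rounding.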


COMPLEX NUMBERS 


The set of complex numbers is denoted by C. Formally, a complex number is an 
ordered pair (a,b) of real numbers; equality, addition and multiplication of complex num- 
bers are defined as follows: 


$$(a, b) = (c, d) \quad\text{iff}\quad a = c \text{ and } b = d$$
$$(a, b) + (c, d) = (a + c, b + d)$$
$$(a, b)(c, d) = (ac - bd, ad + bc)$$




We identify the real number a with the complex number (a, 0):
$$a \leftrightarrow (a, 0)$$
This is possible since the operations of addition and multiplication of real numbers are
preserved under the correspondence:
$$(a, 0) + (b, 0) = (a + b, 0) \quad\text{and}\quad (a, 0)(b, 0) = (ab, 0)$$
Thus we view R as a subset of C and replace (a, 0) by a whenever convenient and possible.
The complex number (0, 1), denoted by i, has the important property that
$$i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1 \quad\text{or}\quad i = \sqrt{-1}$$
Furthermore, using the fact
$$(a, b) = (a, 0) + (0, b) \quad\text{and}\quad (0, b) = (b, 0)(0, 1)$$
we have
$$(a, b) = (a, 0) + (b, 0)(0, 1) = a + bi$$


The notation $a + bi$ is more convenient than (a, b). For example, the sum and product of
complex numbers can be obtained by simply using the commutative and distributive laws
and $i^2 = -1$:
$$(a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i$$
$$(a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i$$
The conjugate of the complex number $z = (a, b) = a + bi$ is denoted and defined by
$$\bar{z} = a - bi$$
(Notice that $z\bar{z} = a^2 + b^2$.) If, in addition, $z \ne 0$, then the inverse $z^{-1}$ of z and division by
z are given by
$$z^{-1} = \frac{\bar{z}}{z\bar{z}} = \frac{a}{a^2 + b^2} + \frac{-b}{a^2 + b^2}\,i \quad\text{and}\quad \frac{w}{z} = wz^{-1}$$
where $w \in C$. We also define
$$-z = -1z \quad\text{and}\quad w - z = w + (-z)$$
Example 1.7: Suppose $z = 2 + 3i$ and $w = 5 - 2i$. Then
$$z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i$$
$$zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i^2 = 16 + 11i$$
$$\bar{z} = \overline{2 + 3i} = 2 - 3i \quad\text{and}\quad \bar{w} = \overline{5 - 2i} = 5 + 2i$$
$$\frac{w}{z} = \frac{5 - 2i}{2 + 3i} = \frac{(5 - 2i)(2 - 3i)}{(2 + 3i)(2 - 3i)} = \frac{4 - 19i}{13} = \frac{4}{13} - \frac{19}{13}\,i$$


Just as the real numbers can be represented by the 
points on a line, the complex numbers can be represented 
by the points in the plane. Specifically, we let the point 
(a, b) in the plane represent the complex number $z = a + bi$,
i.e. whose real part is a and whose imaginary part is b. The
absolute value of z, written $|z|$, is defined as the distance
from z to the origin:
$$|z| = \sqrt{a^2 + b^2}$$
Note that $|z|$ is equal to the norm of the vector (a, b). Also, $|z| = \sqrt{z\bar{z}}$.


Example 1.8: Suppose $z = 2 + 3i$ and $w = 12 - 5i$. Then
$$|z| = \sqrt{4 + 9} = \sqrt{13} \quad\text{and}\quad |w| = \sqrt{144 + 25} = 13$$




Remark: In Appendix B we define the algebraic structure called a field. We emphasize 
that the set C of complex numbers with the above operations of addition and 
multiplication is a field. 
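
The formal definition of a complex number as an ordered pair can be modeled directly. The following is a minimal sketch, assuming pairs (a, b) with integer (or Fraction) components so that the arithmetic stays exact; the names c_add, c_mul, c_conj, c_inv and c_div are hypothetical. The values are those of Example 1.7, and the last line compares the pair product with Python's built-in complex type.

```python
from fractions import Fraction


def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (z[0] + w[0], z[1] + w[1])


def c_mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def c_conj(z):
    """The conjugate of (a, b) is (a, -b)."""
    return (z[0], -z[1])


def c_inv(z):
    """The inverse of a nonzero (a, b) is (a / (a^2 + b^2), -b / (a^2 + b^2))."""
    a, b = z
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("the zero complex number has no inverse")
    return (Fraction(a, n), Fraction(-b, n))


def c_div(w, z):
    """w / z is defined as w times the inverse of z."""
    return c_mul(w, c_inv(z))


z = (2, 3)    # 2 + 3i
w = (5, -2)   # 5 - 2i
i = (0, 1)
print(c_mul(i, i))   # (-1, 0)
print(c_add(z, w))   # (7, 1)
print(c_mul(z, w))   # (16, 11)
print(c_div(w, z) == (Fraction(4, 13), Fraction(-19, 13)))       # True
print(complex(*z) * complex(*w) == complex(*c_mul(z, w)))         # True
```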


VECTORS IN C^n


The set of all n-tuples of complex numbers, denoted by $C^n$, is called complex n-space.
Just as in the real case, the elements of $C^n$ are called points or vectors, the elements of C
are called scalars, and vector addition in $C^n$ and scalar multiplication on $C^n$ are given by
$$(z_1, z_2, \ldots, z_n) + (w_1, w_2, \ldots, w_n) = (z_1 + w_1, z_2 + w_2, \ldots, z_n + w_n)$$
$$z(z_1, z_2, \ldots, z_n) = (zz_1, zz_2, \ldots, zz_n)$$
where $z_i, w_i, z \in C$.
Example 1.9:
$$(2 + 3i, 4 - i, 3) + (3 - 2i, 5i, 4 - 6i) = (5 + i, 4 + 4i, 7 - 6i)$$
$$2i(2 + 3i, 4 - i, 3) = (-6 + 4i, 2 + 8i, 6i)$$
Now let u and v be arbitrary vectors in $C^n$:
$$u = (z_1, z_2, \ldots, z_n), \qquad v = (w_1, w_2, \ldots, w_n), \qquad z_i, w_i \in C$$
The dot, or inner, product of u and v is defined as follows:
$$u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n$$


Note that this definition reduces to the previous one in the real case, since $w_i = \bar{w}_i$ when
$w_i$ is real. The norm of u is defined by
$$\|u\| = \sqrt{u \cdot u} = \sqrt{z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n} = \sqrt{|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2}$$
Observe that $u \cdot u$ and so $\|u\|$ are real and positive when $u \ne 0$, and 0 when $u = 0$.


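Example 1.10: Let $u = (2 + 3i, 4 - i, 2i)$ and $v = (3 - 2i, 5, 4 - 6i)$. Then
$$u \cdot v = (2 + 3i)\overline{(3 - 2i)} + (4 - i)\overline{(5)} + (2i)\overline{(4 - 6i)}$$
$$= (2 + 3i)(3 + 2i) + (4 - i)(5) + (2i)(4 + 6i)$$
$$= 13i + 20 - 5i - 12 + 8i = 8 + 16i$$


$$u \cdot u = (2 + 3i)\overline{(2 + 3i)} + (4 - i)\overline{(4 - i)} + (2i)\overline{(2i)}$$
$$= (2 + 3i)(2 - 3i) + (4 - i)(4 + i) + (2i)(-2i)$$
$$= 13 + 17 + 4 = 34$$


$$\|u\| = \sqrt{u \cdot u} = \sqrt{34}$$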


The space C” with the above operations of vector addition, scalar multiplication and dot 
product, is called complex Euclidean n-space. 


Remark: If $u \cdot v$ were defined by $u \cdot v = z_1 w_1 + \cdots + z_n w_n$, then it would be possible for
$u \cdot u = 0$ even though $u \ne 0$, e.g. if $u = (1, i, 0)$. In fact, $u \cdot u$ might not even
be real.
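
The role of the conjugate can be seen in a short computation. The sketch below assumes vectors stored as tuples of Python complex (or int) values; cdot, cnorm and naive_dot are hypothetical names, and naive_dot is the alternative definition without conjugates that the Remark warns against. The first two vectors are those of Example 1.10.

```python
import math


def cdot(u, v):
    """Return u . v = z_1 * conj(w_1) + ... + z_n * conj(w_n)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(z * w.conjugate() for z, w in zip(u, v))


def cnorm(u):
    """Return ||u|| = sqrt(u . u); u . u is real, so only its real part is used."""
    return math.sqrt(cdot(u, u).real)


def naive_dot(u, v):
    """The product WITHOUT conjugates, shown only to illustrate the Remark."""
    return sum(z * w for z, w in zip(u, v))


u = (2 + 3j, 4 - 1j, 2j)
v = (3 - 2j, 5, 4 - 6j)
print(cdot(u, v))          # (8+16j)
print(cdot(u, u).real)     # 34.0

e = (1, 1j, 0)
print(naive_dot(e, e) == 0)   # True, although e is not the zero vector
print(cdot(e, e).real)        # 2.0
```

With the conjugate in place, cdot(v, u) is the conjugate of cdot(u, v); this is the property proved in Problem 1.27(i).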




Solved Problems 


VECTORS IN R^n

1.1. Compute: (i) $(3, -4, 5) + (1, 1, -2)$; (ii) $(1, 2, -3) + (4, -5)$; (iii) $-3(4, -5, -6)$;
(iv) $-(-6, 7, -8)$.
(i) Add corresponding components: $(3, -4, 5) + (1, 1, -2) = (3+1, -4+1, 5-2) = (4, -3, 3)$.
(ii) The sum is not defined since the vectors have different numbers of components.
(iii) Multiply each component by the scalar: $-3(4, -5, -6) = (-12, 15, 18)$.
(iv) Multiply each component by $-1$: $-(-6, 7, -8) = (6, -7, 8)$.


1.2. Let $u = (2, -7, 1)$, $v = (-3, 0, 4)$, $w = (0, 5, -8)$. Find (i) $3u - 4v$, (ii) $2u + 3v - 5w$.
First perform the scalar multiplication and then the vector addition.
(i) $3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)$
(ii) $2u + 3v - 5w = 2(2, -7, 1) + 3(-3, 0, 4) - 5(0, 5, -8)$
$= (4, -14, 2) + (-9, 0, 12) + (0, -25, 40)$
$= (4 - 9 + 0, -14 + 0 - 25, 2 + 12 + 40) = (-5, -39, 54)$


1.3. Find x and y if $(x, 3) = (2, x + y)$.
Since the two vectors are equal, the corresponding components are equal to each other:
$$x = 2, \qquad 3 = x + y$$
Substitute $x = 2$ into the second equation to obtain $y = 1$. Thus $x = 2$ and $y = 1$.


1.4. Find x and y if $(4, y) = x(2, 3)$.
Multiply by the scalar x to obtain $(4, y) = x(2, 3) = (2x, 3x)$.
Set the corresponding components equal to each other: $4 = 2x$, $y = 3x$.
Solve the linear equations for x and y: $x = 2$ and $y = 6$.


1.5. Find x, y and z if $(2, -3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)$.
First multiply by the scalars x, y and z and then add:
$$(2, -3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)$$
$$= (x, x, x) + (y, y, 0) + (z, 0, 0)$$
$$= (x + y + z, x + y, x)$$


Now set the corresponding components equal to each other:
$$x + y + z = 2, \qquad x + y = -3, \qquad x = 4$$


To solve the system of equations, substitute $x = 4$ into the second equation to obtain $4 + y = -3$
or $y = -7$. Then substitute into the first equation to find $z = 5$. Thus $x = 4$, $y = -7$, $z = 5$.
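
The substitution steps of Problem 1.5 can be sketched as a short routine. This is only an illustration for this particular triangular system; the function names solve_triangular and combine are hypothetical, and the target vector (a, b, c) is assumed to satisfy x + y + z = a, x + y = b, x = c.

```python
def solve_triangular(target):
    """Solve x(1,1,1) + y(1,1,0) + z(1,0,0) = (a, b, c) by back substitution."""
    a, b, c = target
    x = c            # third components: x = c
    y = b - x        # second components: x + y = b
    z = a - x - y    # first components: x + y + z = a
    return x, y, z


def combine(x, y, z):
    """Return x(1,1,1) + y(1,1,0) + z(1,0,0) as a single vector."""
    return (x + y + z, x + y, x)


x, y, z = solve_triangular((2, -3, 4))
print(x, y, z)            # 4 -7 5
print(combine(x, y, z))   # (2, -3, 4)
```

Recombining the solution with combine is a quick check that the scalars found do reproduce the given vector.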


1.6. Prove Theorem 1.1: For any vectors $u, v, w \in R^n$ and any scalars $k, k' \in R$,
(i) $(u + v) + w = u + (v + w)$        (v) $k(u + v) = ku + kv$
(ii) $u + 0 = u$                       (vi) $(k + k')u = ku + k'u$
(iii) $u + (-u) = 0$                   (vii) $(kk')u = k(k'u)$
(iv) $u + v = v + u$                   (viii) $1u = u$


Let $u_i$, $v_i$ and $w_i$ be the ith components of u, v and w, respectively.


(i) By definition, $u_i + v_i$ is the ith component of $u + v$ and so $(u_i + v_i) + w_i$ is the ith component
of $(u + v) + w$. On the other hand, $v_i + w_i$ is the ith component of $v + w$ and so $u_i + (v_i + w_i)$
is the ith component of $u + (v + w)$. But $u_i$, $v_i$ and $w_i$ are real numbers for which the
associative law holds, that is,
$$(u_i + v_i) + w_i = u_i + (v_i + w_i) \qquad\text{for } i = 1, \ldots, n$$
Accordingly, $(u + v) + w = u + (v + w)$ since their corresponding components are equal.


(ii) Here, $0 = (0, 0, \ldots, 0)$; hence
$$u + 0 = (u_1, u_2, \ldots, u_n) + (0, 0, \ldots, 0)$$
$$= (u_1 + 0, u_2 + 0, \ldots, u_n + 0) = (u_1, u_2, \ldots, u_n) = u$$


(iii) Since $-u = -1(u_1, u_2, \ldots, u_n) = (-u_1, -u_2, \ldots, -u_n)$,
$$u + (-u) = (u_1, u_2, \ldots, u_n) + (-u_1, -u_2, \ldots, -u_n)$$
$$= (u_1 - u_1, u_2 - u_2, \ldots, u_n - u_n) = (0, 0, \ldots, 0) = 0$$


(iv) By definition, $u_i + v_i$ is the ith component of $u + v$, and $v_i + u_i$ is the ith component of $v + u$.
But $u_i$ and $v_i$ are real numbers for which the commutative law holds, that is,
$$u_i + v_i = v_i + u_i, \qquad i = 1, \ldots, n$$
Hence $u + v = v + u$ since their corresponding components are equal.


(v) Since $u_i + v_i$ is the ith component of $u + v$, $k(u_i + v_i)$ is the ith component of $k(u + v)$. Since
$ku_i$ and $kv_i$ are the ith components of ku and kv respectively, $ku_i + kv_i$ is the ith component
of $ku + kv$. But k, $u_i$ and $v_i$ are real numbers; hence
$$k(u_i + v_i) = ku_i + kv_i, \qquad i = 1, \ldots, n$$
Thus $k(u + v) = ku + kv$, as corresponding components are equal.


(vi) Observe that the first plus sign refers to the addition of the two scalars k and k' whereas the
second plus sign refers to the vector addition of the two vectors ku and k'u.

By definition, $(k + k')u_i$ is the ith component of the vector $(k + k')u$. Since $ku_i$ and $k'u_i$
are the ith components of ku and k'u respectively, $ku_i + k'u_i$ is the ith component of $ku + k'u$.
But k, k' and $u_i$ are real numbers; hence
$$(k + k')u_i = ku_i + k'u_i, \qquad i = 1, \ldots, n$$
Thus $(k + k')u = ku + k'u$, as corresponding components are equal.


(vii) Since $k'u_i$ is the ith component of k'u, $k(k'u_i)$ is the ith component of $k(k'u)$. But $(kk')u_i$ is the
ith component of $(kk')u$ and, since k, k' and $u_i$ are real numbers,
$$(kk')u_i = k(k'u_i), \qquad i = 1, \ldots, n$$
Hence $(kk')u = k(k'u)$, as corresponding components are equal.


(viii) $1 \cdot u = 1(u_1, u_2, \ldots, u_n) = (1u_1, 1u_2, \ldots, 1u_n) = (u_1, u_2, \ldots, u_n) = u$.


1.7. Show that $0u = 0$ for any vector u, where clearly the first 0 is a scalar and the second
0 a vector.
Method 1: $0u = 0(u_1, u_2, \ldots, u_n) = (0u_1, 0u_2, \ldots, 0u_n) = (0, 0, \ldots, 0) = 0$
Method 2: By Theorem 1.1, $0u = (0 + 0)u = 0u + 0u$


Adding $-0u$ to both sides gives us the required result.


DOT PRODUCT 


1.8. Compute $u \cdot v$ where: (i) $u = (2, -3, 6)$, $v = (8, 2, -3)$; (ii) $u = (1, -8, 0, 5)$, $v = (8, 6, 4)$;
(iii) $u = (3, -5, 2, 1)$, $v = (4, 1, -2, 5)$.


(i) Multiply corresponding components and add: $u \cdot v = 2 \cdot 8 + (-3) \cdot 2 + 6 \cdot (-3) = -8$.


(ii) The dot product is not defined between vectors with different numbers of components.


(iii) Multiply corresponding components and add: $u \cdot v = 3 \cdot 4 + (-5) \cdot 1 + 2 \cdot (-2) + 1 \cdot 5 = 8$.




1.9. Determine k so that the vectors u and v are orthogonal where
(i) $u = (1, k, -3)$ and $v = (2, -5, 4)$
(ii) $u = (2, 3k, -4, 1, 5)$ and $v = (6, -1, 3, 7, 2k)$


In each case, compute $u \cdot v$, set it equal to 0, and solve for k.


(i) $u \cdot v = 1 \cdot 2 + k \cdot (-5) + (-3) \cdot 4 = 2 - 5k - 12 = 0$, $\; -5k - 10 = 0$, $\; k = -2$
(ii) $u \cdot v = 2 \cdot 6 + 3k \cdot (-1) + (-4) \cdot 3 + 1 \cdot 7 + 5 \cdot 2k$
$= 12 - 3k - 12 + 7 + 10k = 0$, $\; k = -1$


1.10. Prove Theorem 1.2: For any vectors $u, v, w \in R^n$ and any scalar $k \in R$,
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$        (iii) $u \cdot v = v \cdot u$
(ii) $(ku) \cdot v = k(u \cdot v)$                   (iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$
Let $u = (u_1, u_2, \ldots, u_n)$, $v = (v_1, v_2, \ldots, v_n)$, $w = (w_1, w_2, \ldots, w_n)$.
(i) Since $u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$,
$$(u + v) \cdot w = (u_1 + v_1)w_1 + (u_2 + v_2)w_2 + \cdots + (u_n + v_n)w_n$$
$$= u_1w_1 + v_1w_1 + u_2w_2 + v_2w_2 + \cdots + u_nw_n + v_nw_n$$
$$= (u_1w_1 + u_2w_2 + \cdots + u_nw_n) + (v_1w_1 + v_2w_2 + \cdots + v_nw_n)$$
$$= u \cdot w + v \cdot w$$


(ii) Since $ku = (ku_1, ku_2, \ldots, ku_n)$,
$$(ku) \cdot v = ku_1v_1 + ku_2v_2 + \cdots + ku_nv_n = k(u_1v_1 + u_2v_2 + \cdots + u_nv_n) = k(u \cdot v)$$


(iii) $u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n = v_1u_1 + v_2u_2 + \cdots + v_nu_n = v \cdot u$


(iv) Since $u_i^2$ is nonnegative for each i, and since the sum of nonnegative real numbers is
nonnegative,
$$u \cdot u = u_1^2 + u_2^2 + \cdots + u_n^2 \ge 0$$
Furthermore, $u \cdot u = 0$ iff $u_i = 0$ for each i, that is, iff $u = 0$.


DISTANCE AND NORM IN R^n


1.11. Find the distance $d(u, v)$ between the vectors u and v where: (i) $u = (1, 7)$, $v = (6, -5)$;
(ii) $u = (3, -5, 4)$, $v = (6, 2, -1)$; (iii) $u = (5, 3, -2, -4, -1)$, $v = (2, -1, 0, -7, 2)$.


In each case use the formula $d(u, v) = \sqrt{(u_1 - v_1)^2 + \cdots + (u_n - v_n)^2}$.
(i) $d(u, v) = \sqrt{(1 - 6)^2 + (7 + 5)^2} = \sqrt{25 + 144} = \sqrt{169} = 13$
(ii) $d(u, v) = \sqrt{(3 - 6)^2 + (-5 - 2)^2 + (4 + 1)^2} = \sqrt{9 + 49 + 25} = \sqrt{83}$
(iii) $d(u, v) = \sqrt{(5 - 2)^2 + (3 + 1)^2 + (-2 + 0)^2 + (-4 + 7)^2 + (-1 - 2)^2} = \sqrt{47}$


1.12. Find k such that $d(u, v) = 6$ where $u = (2, k, 1, -4)$ and $v = (3, -1, 6, -3)$.
$$(d(u, v))^2 = (2 - 3)^2 + (k + 1)^2 + (1 - 6)^2 + (-4 + 3)^2 = k^2 + 2k + 28$$


Now solve $k^2 + 2k + 28 = 6^2$ to obtain $k = 2, -4$.


1.13. Find the norm $\|u\|$ of the vector u if (i) $u = (2, -7)$, (ii) $u = (3, -12, -4)$.
In each case use the formula $\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$.
(i) $\|u\| = \sqrt{2^2 + (-7)^2} = \sqrt{4 + 49} = \sqrt{53}$
(ii) $\|u\| = \sqrt{3^2 + (-12)^2 + (-4)^2} = \sqrt{9 + 144 + 16} = \sqrt{169} = 13$


1.14. Determine k such that $\|u\| = \sqrt{39}$ where $u = (1, k, -2, 5)$.
$$\|u\|^2 = 1^2 + k^2 + (-2)^2 + 5^2 = k^2 + 30$$


Now solve $k^2 + 30 = 39$ and obtain $k = 3, -3$.


1.15. Show that $\|u\| \ge 0$, and $\|u\| = 0$ iff $u = 0$.
By Theorem 1.2, $u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$. Since $\|u\| = \sqrt{u \cdot u}$, the result follows.


1.16. Prove Theorem 1.3 (Cauchy-Schwarz):
For any vectors $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in $R^n$, $|u \cdot v| \le \|u\| \, \|v\|$.


We shall prove the following stronger statement: $|u \cdot v| \le \sum_{i=1}^{n} |u_i v_i| \le \|u\| \, \|v\|$.


If $u = 0$ or $v = 0$, then the inequality reduces to $0 \le 0 \le 0$ and is therefore true. Hence we
need only consider the case in which $u \ne 0$ and $v \ne 0$, i.e. where $\|u\| \ne 0$ and $\|v\| \ne 0$.
Furthermore,
$$|u \cdot v| = |u_1v_1 + \cdots + u_nv_n| \le |u_1v_1| + \cdots + |u_nv_n| = \sum |u_iv_i|$$
Thus we need only prove the second inequality.
Now for any real numbers $x, y \in R$, $0 \le (x - y)^2 = x^2 - 2xy + y^2$ or, equivalently,
$$2xy \le x^2 + y^2 \qquad (1)$$


Set $x = |u_i|/\|u\|$ and $y = |v_i|/\|v\|$ in (1) to obtain, for any i,
$$2\,\frac{|u_i|}{\|u\|}\,\frac{|v_i|}{\|v\|} \le \frac{|u_i|^2}{\|u\|^2} + \frac{|v_i|^2}{\|v\|^2} \qquad (2)$$


But, by definition of the norm of a vector, $\|u\|^2 = \sum u_i^2 = \sum |u_i|^2$ and $\|v\|^2 = \sum v_i^2 = \sum |v_i|^2$. Thus
summing (2) with respect to i and using $|u_iv_i| = |u_i| \, |v_i|$, we have
$$2\,\frac{\sum |u_iv_i|}{\|u\| \, \|v\|} \le \frac{\sum |u_i|^2}{\|u\|^2} + \frac{\sum |v_i|^2}{\|v\|^2} = \frac{\|u\|^2}{\|u\|^2} + \frac{\|v\|^2}{\|v\|^2} = 2$$
that is,
$$\frac{\sum |u_iv_i|}{\|u\| \, \|v\|} \le 1$$


Multiplying both sides by $\|u\| \, \|v\|$, we obtain the required inequality.
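
The chain of inequalities in the stronger statement can be checked numerically. The sketch below is only a sanity check on sample vectors, not a substitute for the proof; the function name cauchy_schwarz_chain, the random sampling and the tolerance 1e-9 (which absorbs floating-point rounding) are our own assumptions.

```python
import math
import random


def dot(u, v):
    """Return the dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Return the Euclidean norm of u."""
    return math.sqrt(dot(u, u))


def cauchy_schwarz_chain(u, v):
    """Return the three quantities |u.v|, sum |u_i v_i|, and ||u|| ||v||."""
    lhs = abs(dot(u, v))
    mid = sum(abs(a * b) for a, b in zip(u, v))
    rhs = norm(u) * norm(v)
    return lhs, mid, rhs


# The vectors u and v of Example 1.5.
lhs, mid, rhs = cauchy_schwarz_chain((1, -2, 3, -4), (6, 7, 1, -2))
print(lhs, mid)   # 3 31

# Random trials; the small tolerance only compensates for rounding.
random.seed(0)
for _ in range(1000):
    n = random.randint(1, 6)
    u = [random.uniform(-10, 10) for _ in range(n)]
    v = [random.uniform(-10, 10) for _ in range(n)]
    a, b, c = cauchy_schwarz_chain(u, v)
    assert a <= b + 1e-9 and b <= c + 1e-9
```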


1.17. Prove Minkowski’s inequality:
For any vectors $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in $R^n$, $\|u + v\| \le \|u\| + \|v\|$.
If $\|u + v\| = 0$, the inequality clearly holds. Thus we need only consider the case $\|u + v\| \ne 0$.
Now $|u_i + v_i| \le |u_i| + |v_i|$ for any real numbers $u_i, v_i \in R$. Hence
$$\|u + v\|^2 = \sum (u_i + v_i)^2 = \sum |u_i + v_i|^2$$
$$= \sum |u_i + v_i| \, |u_i + v_i| \le \sum |u_i + v_i| \, (|u_i| + |v_i|)$$
$$= \sum |u_i + v_i| \, |u_i| + \sum |u_i + v_i| \, |v_i|$$


But by the Cauchy-Schwarz inequality (see preceding problem),
$$\sum |u_i + v_i| \, |u_i| \le \|u + v\| \, \|u\| \quad\text{and}\quad \sum |u_i + v_i| \, |v_i| \le \|u + v\| \, \|v\|$$
Thus
$$\|u + v\|^2 \le \|u + v\| \, \|u\| + \|u + v\| \, \|v\| = \|u + v\| \, (\|u\| + \|v\|)$$


Dividing by $\|u + v\|$, we obtain the required inequality.




1.18. Prove that the norm in $R^n$ satisfies the following laws:
$[N_1]$: For any vector u, $\|u\| \ge 0$; and $\|u\| = 0$ iff $u = 0$.
$[N_2]$: For any vector u and any scalar k, $\|ku\| = |k| \, \|u\|$.
$[N_3]$: For any vectors u and v, $\|u + v\| \le \|u\| + \|v\|$.


$[N_1]$ was proved in Problem 1.15, and $[N_3]$ in Problem 1.17. Hence we need only prove that
$[N_2]$ holds.
Suppose $u = (u_1, u_2, \ldots, u_n)$ and so $ku = (ku_1, ku_2, \ldots, ku_n)$. Then
$$\|ku\|^2 = (ku_1)^2 + (ku_2)^2 + \cdots + (ku_n)^2 = k^2u_1^2 + k^2u_2^2 + \cdots + k^2u_n^2$$
$$= k^2(u_1^2 + u_2^2 + \cdots + u_n^2) = k^2 \|u\|^2$$


The square root of both sides of the equality gives us the required result.
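
The three laws can be spot-checked on concrete vectors. This is a minimal sketch, assuming vectors given as sequences of real numbers of equal length; check_norm_laws is a hypothetical helper and the tolerance tol only compensates for floating-point rounding. A check of this kind illustrates the laws on examples and proves nothing in general.

```python
import math


def norm(u):
    """Return the Euclidean norm of u."""
    return math.sqrt(sum(a * a for a in u))


def check_norm_laws(u, v, k, tol=1e-9):
    """Return three booleans, one for each of the laws N1, N2, N3."""
    n1 = norm(u) >= 0
    n2 = math.isclose(norm([k * a for a in u]), abs(k) * norm(u),
                      rel_tol=tol, abs_tol=tol)
    n3 = norm([a + b for a, b in zip(u, v)]) <= norm(u) + norm(v) + tol
    return n1, n2, n3


u = (3, -12, -4)   # the vector of Problem 1.13(ii), with norm 13
v = (1, 2, 2)      # an arbitrary second vector, with norm 3
print(norm(u))                      # 13.0
print(check_norm_laws(u, v, -2))    # (True, True, True)
```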


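COMPLEX NUMBERS
1.19. Simplify: (i) $(5 + 3i)(2 - 7i)$; (ii) $(4 - 3i)^2$; (iii) $\dfrac{1}{3 - 4i}$; (iv) $\dfrac{2 - 7i}{5 + 3i}$; (v) $i^3$, $i^4$, $i^{31}$;
(vi) $(1 + 2i)^3$; (vii) $\left(\dfrac{1}{2 - 3i}\right)^2$.


(i) $(5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i$
(ii) $(4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i$
(iii) $\dfrac{1}{3 - 4i} = \dfrac{3 + 4i}{(3 - 4i)(3 + 4i)} = \dfrac{3 + 4i}{25} = \dfrac{3}{25} + \dfrac{4}{25}\,i$
(iv) $\dfrac{2 - 7i}{5 + 3i} = \dfrac{(2 - 7i)(5 - 3i)}{(5 + 3i)(5 - 3i)} = \dfrac{-11 - 41i}{34} = -\dfrac{11}{34} - \dfrac{41}{34}\,i$
(v) $i^3 = i^2 \cdot i = (-1)i = -i$; $\; i^4 = i^2 \cdot i^2 = 1$; $\; i^{31} = (i^4)^7 \cdot i^3 = 1^7 \cdot (-i) = -i$
(vi) $(1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i$
(vii) $\left(\dfrac{1}{2 - 3i}\right)^2 = \dfrac{1}{-5 - 12i} = \dfrac{-5 + 12i}{(-5 - 12i)(-5 + 12i)} = \dfrac{-5 + 12i}{169} = -\dfrac{5}{169} + \dfrac{12}{169}\,i$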
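
Simplifications of this kind are easy to cross-check with Python's built-in complex type. The sketch below pairs each expression of Problem 1.19 with the value obtained by hand; the helper close and its tolerances are our own assumptions, needed because floating-point division and powers are only approximate.

```python
import cmath


def close(a, b):
    """Compare two complex numbers up to a small floating-point tolerance."""
    return cmath.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)


# Each pair is (expression evaluated by Python, value obtained by hand).
checks = [
    ((5 + 3j) * (2 - 7j), 31 - 29j),
    ((4 - 3j) ** 2, 7 - 24j),
    (1 / (3 - 4j), complex(3, 4) / 25),
    ((2 - 7j) / (5 + 3j), complex(-11, -41) / 34),
    (1j ** 3, -1j),
    (1j ** 4, 1),
    (1j ** 31, -1j),
    ((1 + 2j) ** 3, -11 - 2j),
    ((1 / (2 - 3j)) ** 2, complex(-5, 12) / 169),
]

all_ok = all(close(got, want) for got, want in checks)
print(all_ok)
```

Python writes the imaginary unit as j rather than i; otherwise the expressions mirror the problem statement term by term.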


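1.20. Let $z = 2 - 3i$ and $w = 4 + 5i$. Find:
(i) $z + w$ and $zw$; (ii) $z/w$; (iii) $\bar{z}$ and $\bar{w}$; (iv) $|z|$ and $|w|$.
(i) $z + w = 2 - 3i + 4 + 5i = 6 + 2i$
$zw = (2 - 3i)(4 + 5i) = 8 - 12i + 10i - 15i^2 = 23 - 2i$
(ii) $\dfrac{z}{w} = \dfrac{2 - 3i}{4 + 5i} = \dfrac{(2 - 3i)(4 - 5i)}{(4 + 5i)(4 - 5i)} = \dfrac{-7 - 22i}{41} = -\dfrac{7}{41} - \dfrac{22}{41}\,i$
(iii) Use $\overline{a + bi} = a - bi$: $\bar{z} = \overline{2 - 3i} = 2 + 3i$; $\bar{w} = \overline{4 + 5i} = 4 - 5i$.


(iv) Use $|a + bi| = \sqrt{a^2 + b^2}$: $|z| = |2 - 3i| = \sqrt{4 + 9} = \sqrt{13}$; $|w| = |4 + 5i| = \sqrt{16 + 25} = \sqrt{41}$.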


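1.21. Prove: For any complex numbers $z, w \in C$,
(i) $\overline{z + w} = \bar{z} + \bar{w}$, (ii) $\overline{zw} = \bar{z}\,\bar{w}$, (iii) $\bar{\bar{z}} = z$.
Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in R$.
(i) $\overline{z + w} = \overline{(a + bi) + (c + di)} = \overline{(a + c) + (b + d)i}$
$= (a + c) - (b + d)i = a + c - bi - di$
$= (a - bi) + (c - di) = \bar{z} + \bar{w}$


(ii) $\overline{zw} = \overline{(a + bi)(c + di)} = \overline{(ac - bd) + (ad + bc)i}$
$= (ac - bd) - (ad + bc)i = (a - bi)(c - di) = \bar{z}\,\bar{w}$


(iii) $\bar{\bar{z}} = \overline{\overline{a + bi}} = \overline{a - bi} = a - (-b)i = a + bi = z$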


1.22. Prove: For any complex numbers $z, w \in C$, $|zw| = |z| \, |w|$.
Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in R$. Then
$$|z|^2 = a^2 + b^2, \qquad |w|^2 = c^2 + d^2, \qquad\text{and}\qquad zw = (ac - bd) + (ad + bc)i$$
Thus
$$|zw|^2 = (ac - bd)^2 + (ad + bc)^2$$
$$= a^2c^2 - 2abcd + b^2d^2 + a^2d^2 + 2abcd + b^2c^2$$
$$= a^2(c^2 + d^2) + b^2(c^2 + d^2) = (a^2 + b^2)(c^2 + d^2) = |z|^2 \, |w|^2$$


The square root of both sides gives us the desired result.


1.23. Prove: For any complex numbers $z, w \in C$, $|z + w| \le |z| + |w|$.


Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in R$. Consider the vectors $u = (a, b)$ and
$v = (c, d)$ in $R^2$. Note that
$$|z| = \sqrt{a^2 + b^2} = \|u\|, \qquad |w| = \sqrt{c^2 + d^2} = \|v\|$$
and
$$|z + w| = |(a + c) + (b + d)i| = \sqrt{(a + c)^2 + (b + d)^2} = \|(a + c, b + d)\| = \|u + v\|$$
By Minkowski’s inequality (Problem 1.17), $\|u + v\| \le \|u\| + \|v\|$ and so
$$|z + w| = \|u + v\| \le \|u\| + \|v\| = |z| + |w|$$


VECTORS IN C^n


1.24. Let $u = (3 - 2i, 4i, 1 + 6i)$ and $v = (5 + i, 2 - 3i, 5)$. Find:
(i) $u + v$, (ii) $4iu$, (iii) $(1 + i)v$, (iv) $(1 - 2i)u + (3 + i)v$.
(i) Add corresponding components: $u + v = (8 - i, 2 + i, 6 + 6i)$.
(ii) Multiply each component of u by the scalar 4i: $4iu = (8 + 12i, -16, -24 + 4i)$.
(iii) Multiply each component of v by the scalar $1 + i$:
$$(1 + i)v = (5 + 6i + i^2, 2 - i - 3i^2, 5 + 5i) = (4 + 6i, 5 - i, 5 + 5i)$$


(iv) First perform the scalar multiplication and then the vector addition:
$$(1 - 2i)u + (3 + i)v = (-1 - 8i, 8 + 4i, 13 + 4i) + (14 + 8i, 9 - 7i, 15 + 5i)$$
$$= (13, 17 - 3i, 28 + 9i)$$


1.25. Find $u \cdot v$ and $v \cdot u$ where: (i) $u = (1 - 2i, 3 + i)$, $v = (4 + 2i, 5 - 6i)$; (ii) $u = (3 - 2i, 4i, 1 + 6i)$,
$v = (5 + i, 2 - 3i, 7 + 2i)$.
Recall that the conjugates of the second vector appear in the dot product:
$$(z_1, \ldots, z_n) \cdot (w_1, \ldots, w_n) = z_1\bar{w}_1 + \cdots + z_n\bar{w}_n$$


(i) $u \cdot v = (1 - 2i)\overline{(4 + 2i)} + (3 + i)\overline{(5 - 6i)}$
$= (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i$


$v \cdot u = (4 + 2i)\overline{(1 - 2i)} + (5 - 6i)\overline{(3 + i)}$
$= (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i$


(ii) $u \cdot v = (3 - 2i)\overline{(5 + i)} + (4i)\overline{(2 - 3i)} + (1 + 6i)\overline{(7 + 2i)}$
$= (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i$

$v \cdot u = (5 + i)\overline{(3 - 2i)} + (2 - 3i)\overline{(4i)} + (7 + 2i)\overline{(1 + 6i)}$
$= (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i$


In both examples, $v \cdot u = \overline{u \cdot v}$. This holds true in general, as seen in Problem 1.27.




1.26. Find $\|u\|$ where: (i) $u = (3 + 4i, 5 - 2i, 1 - 3i)$; (ii) $u = (4 - i, 2i, 3 + 2i, 1 - 5i)$.

Recall that $z\bar{z} = a^2 + b^2$ when $z = a + bi$. Use
$\|u\|^2 = u \cdot u = z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n$ where $u = (z_1, z_2, \ldots, z_n)$

(i) $\|u\|^2 = (3)^2 + (4)^2 + (5)^2 + (-2)^2 + (1)^2 + (-3)^2 = 64$, or $\|u\| = 8$

(ii) $\|u\|^2 = 4^2 + (-1)^2 + 2^2 + 3^2 + 2^2 + 1^2 + (-5)^2 = 60$, or $\|u\| = \sqrt{60} = 2\sqrt{15}$
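The dot product and norm in $C^n$ used in Problems 1.25 and 1.26 can be sketched in Python as follows. The function names `cdot` and `cnorm` are hypothetical helpers chosen for this illustration; the only convention taken from the text is that the conjugates of the second vector appear in the product.

```python
import math

def cdot(u, v):
    """Dot product in C^n: conjugate the components of the second vector."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(z * w.conjugate() for z, w in zip(u, v))

def cnorm(u):
    """Norm in C^n: the square root of u . u, whose imaginary part is zero."""
    return math.sqrt(cdot(u, u).real)

u = (1 - 2j, 3 + 1j)
v = (4 + 2j, 5 - 6j)
print(cdot(u, v))                          # Problem 1.25(i), u . v
print(cdot(v, u))                          # the conjugate of u . v
print(cnorm((3 + 4j, 5 - 2j, 1 - 3j)))     # Problem 1.26(i)
```

Swapping the arguments of `cdot` conjugates the result, which is the property proved in Problem 1.27(i).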




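1.27. Prove: For any vectors $u, v \in C^n$ and any scalar $z \in C$, (i) $u \cdot v = \overline{v \cdot u}$, (ii) $(zu) \cdot v = z(u \cdot v)$, (iii) $u \cdot (zv) = \bar{z}(u \cdot v)$. (Compare with Theorem 1.2.)
Suppose $u = (z_1, z_2, \ldots, z_n)$ and $v = (w_1, w_2, \ldots, w_n)$.

(i) Using the properties of the conjugate established in Problem 1.21,
$\overline{v \cdot u} = \overline{w_1\bar{z}_1 + w_2\bar{z}_2 + \cdots + w_n\bar{z}_n} = \overline{w_1\bar{z}_1} + \overline{w_2\bar{z}_2} + \cdots + \overline{w_n\bar{z}_n}$
$= \bar{w}_1z_1 + \bar{w}_2z_2 + \cdots + \bar{w}_nz_n = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n = u \cdot v$

(ii) Since $zu = (zz_1, zz_2, \ldots, zz_n)$,
$(zu) \cdot v = zz_1\bar{w}_1 + zz_2\bar{w}_2 + \cdots + zz_n\bar{w}_n = z(z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n) = z(u \cdot v)$

(iii) Method 1. Since $zv = (zw_1, zw_2, \ldots, zw_n)$,
$u \cdot (zv) = z_1\overline{zw_1} + z_2\overline{zw_2} + \cdots + z_n\overline{zw_n} = z_1\bar{z}\bar{w}_1 + z_2\bar{z}\bar{w}_2 + \cdots + z_n\bar{z}\bar{w}_n$
$= \bar{z}(z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n) = \bar{z}(u \cdot v)$
Method 2. Using (i) and (ii),
$u \cdot (zv) = \overline{(zv) \cdot u} = \overline{z(v \cdot u)} = \bar{z}\,\overline{(v \cdot u)} = \bar{z}(u \cdot v)$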


MISCELLANEOUS PROBLEMS

1.28. Let $u = (3, -2, 1, 4)$ and $v = (7, 1, -3, 6)$. Find:
(i) $u + v$; (ii) $4u$; (iii) $2u - 3v$; (iv) $u \cdot v$; (v) $\|u\|$ and $\|v\|$; (vi) $d(u, v)$.

(i) $u + v = (3 + 7, -2 + 1, 1 - 3, 4 + 6) = (10, -1, -2, 10)$
(ii) $4u = (4 \cdot 3, 4 \cdot (-2), 4 \cdot 1, 4 \cdot 4) = (12, -8, 4, 16)$
(iii) $2u - 3v = (6, -4, 2, 8) + (-21, -3, 9, -18) = (-15, -7, 11, -10)$
(iv) $u \cdot v = 21 - 2 - 3 + 24 = 40$
(v) $\|u\| = \sqrt{9 + 4 + 1 + 16} = \sqrt{30}$, $\|v\| = \sqrt{49 + 1 + 9 + 36} = \sqrt{95}$
(vi) $d(u, v) = \sqrt{(3 - 7)^2 + (-2 - 1)^2 + (1 + 3)^2 + (4 - 6)^2} = \sqrt{45} = 3\sqrt{5}$

1.29. Let $u = (7 - 2i, 2 + 5i)$ and $v = (1 + i, -3 - 6i)$. Find:
(i) $u + v$; (ii) $2iu$; (iii) $(3 - i)v$; (iv) $u \cdot v$; (v) $\|u\|$ and $\|v\|$.

(i) $u + v = (7 - 2i + 1 + i, 2 + 5i - 3 - 6i) = (8 - i, -1 - i)$
(ii) $2iu = (14i - 4i^2, 4i + 10i^2) = (4 + 14i, -10 + 4i)$
(iii) $(3 - i)v = (3 + 3i - i - i^2, -9 - 18i + 3i + 6i^2) = (4 + 2i, -15 - 15i)$
(iv) $u \cdot v = (7 - 2i)\overline{(1 + i)} + (2 + 5i)\overline{(-3 - 6i)}$
$= (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i$
(v) $\|u\| = \sqrt{7^2 + (-2)^2 + 2^2 + 5^2} = \sqrt{82}$, $\|v\| = \sqrt{1^2 + 1^2 + (-3)^2 + (-6)^2} = \sqrt{47}$


1.30. Any pair of points $P = (a_i)$ and $Q = (b_i)$ in $R^n$ defines the directed line segment from $P$ to $Q$, written $\overrightarrow{PQ}$. We identify $\overrightarrow{PQ}$ with the vector $v = Q - P$:
$\overrightarrow{PQ} = Q - P = (b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n)$
Find the vector $v$ identified with $\overrightarrow{PQ}$ where:
(i) $P = (2, 5)$, $Q = (-3, 4)$
(ii) $P = (1, -2, 4)$, $Q = (6, 0, -3)$

(i) $v = Q - P = (-3 - 2, 4 - 5) = (-5, -1)$
(ii) $v = Q - P = (6 - 1, 0 + 2, -3 - 4) = (5, 2, -7)$

1.31. The set $H$ of elements in $R^n$ which are solutions of a linear equation in $n$ unknowns $x_1, \ldots, x_n$ of the form
$c_1x_1 + c_2x_2 + \cdots + c_nx_n = b$  (*)
with $u = (c_1, \ldots, c_n) \ne 0$ in $R^n$, is called a hyperplane of $R^n$, and (*) is called an equation of $H$. (We frequently identify $H$ with (*).) Show that the directed line segment $\overrightarrow{PQ}$ of any pair of points $P, Q \in H$ is orthogonal to the coefficient vector $u$; the vector $u$ is said to be normal to the hyperplane $H$.

Suppose $P = (a_1, \ldots, a_n)$ and $Q = (b_1, \ldots, b_n)$. Then the $a_i$ and the $b_i$ are solutions of the given equation:
$c_1a_1 + c_2a_2 + \cdots + c_na_n = b$, $c_1b_1 + c_2b_2 + \cdots + c_nb_n = b$
Let $v = \overrightarrow{PQ} = Q - P = (b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n)$
Then $u \cdot v = c_1(b_1 - a_1) + c_2(b_2 - a_2) + \cdots + c_n(b_n - a_n)$
$= c_1b_1 - c_1a_1 + c_2b_2 - c_2a_2 + \cdots + c_nb_n - c_na_n$
$= (c_1b_1 + c_2b_2 + \cdots + c_nb_n) - (c_1a_1 + c_2a_2 + \cdots + c_na_n) = b - b = 0$
Hence $v$, that is, $\overrightarrow{PQ}$, is orthogonal to $u$.

1.32. Find an equation of the hyperplane $H$ in $R^4$ if: (i) $H$ passes through $P = (3, -2, 1, -4)$ and is normal to $u = (2, 5, -6, -2)$; (ii) $H$ passes through $P = (1, -2, 3, 5)$ and is parallel to the hyperplane $H'$ determined by $4x - 5y + 2z + w = 11$.

(i) An equation of $H$ is of the form $2x + 5y - 6z - 2w = k$ since it is normal to $u$. Substitute $P$ into this equation to obtain $k = -2$. Thus an equation of $H$ is $2x + 5y - 6z - 2w = -2$.

(ii) $H$ and $H'$ are parallel iff corresponding normal vectors are in the same or opposite direction. Hence an equation of $H$ is of the form $4x - 5y + 2z + w = k$. Substituting $P$ into this equation, we find $k = 25$. Thus an equation of $H$ is $4x - 5y + 2z + w = 25$.
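The substitution step used in both parts of Problem 1.32 amounts to evaluating the dot product of the normal vector with the given point. A minimal Python sketch follows; the function name `hyperplane_constant` is a hypothetical helper introduced only for this illustration.

```python
def hyperplane_constant(normal, point):
    """Return k such that the hyperplane normal . X = k passes through point."""
    if len(normal) != len(point):
        raise ValueError("normal and point must have the same number of components")
    if all(c == 0 for c in normal):
        raise ValueError("the normal vector must be nonzero")
    return sum(c * p for c, p in zip(normal, point))

# Problem 1.32(i): normal u = (2, 5, -6, -2), point P = (3, -2, 1, -4)
print(hyperplane_constant((2, 5, -6, -2), (3, -2, 1, -4)))

# Problem 1.32(ii): the normal of the parallel hyperplane is (4, -5, 2, 1)
print(hyperplane_constant((4, -5, 2, 1), (1, -2, 3, 5)))
```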


1.33. The line $l$ in $R^n$ passing through the point $P = (a_i)$ and in the direction of $u = (u_i) \ne 0$ consists of the points $X = P + tu$, $t \in R$, that is, consists of the points $X = (x_i)$ obtained from
$x_1 = a_1 + u_1t$
$x_2 = a_2 + u_2t$
$\cdots$  (*)
$x_n = a_n + u_nt$
where $t$ takes on all real values. The variable $t$ is called a parameter, and (*) is called a parametric representation of $l$.

(i) Find a parametric representation of the line passing through $P$ and in the direction of $u$ where: (a) $P = (2, 5)$ and $u = (-3, 4)$; (b) $P = (4, -2, 3, 1)$ and $u = (2, 5, -7, 11)$.

(ii) Find a parametric representation of the line passing through the points $P$ and $Q$ where: (a) $P = (7, -2)$ and $Q = (9, 3)$; (b) $P = (5, 4, -3)$ and $Q = (1, -3, 2)$.

(i) In each case use the formula (*).
(a) $x = 2 - 3t$, $y = 5 + 4t$
(b) $x = 4 + 2t$, $y = -2 + 5t$, $z = 3 - 7t$, $w = 1 + 11t$

(In $R^2$ we usually eliminate $t$ from the two equations and represent the line by a single equation: $4x + 3y = 23$.)

(ii) First compute $u = \overrightarrow{PQ} = Q - P$. Then use the formula (*).
(a) $u = Q - P = (2, 5)$: $x = 7 + 2t$, $y = -2 + 5t$
(b) $u = Q - P = (-4, -7, 5)$: $x = 5 - 4t$, $y = 4 - 7t$, $z = -3 + 5t$

(Note that in each case we could also write $u = \overrightarrow{QP} = P - Q$.)
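The procedure of part (ii), first forming $u = Q - P$ and then applying formula (*), can be sketched in Python. The names `line_through` and `point_on_line` are hypothetical helpers for this illustration only.

```python
def line_through(p, q):
    """Return (p, u) with u = q - p, so that the line is X = p + t*u."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of components")
    u = tuple(b - a for a, b in zip(p, q))
    if all(c == 0 for c in u):
        raise ValueError("the two points must be distinct")
    return tuple(p), u

def point_on_line(p, u, t):
    """Evaluate the parametric representation x_i = a_i + u_i*t at the value t."""
    return tuple(a + c * t for a, c in zip(p, u))

p, u = line_through((7, -2), (9, 3))
print(u)                         # direction vector of Problem 1.33(ii)(a)
print(point_on_line(p, u, 0))    # t = 0 gives P
print(point_on_line(p, u, 1))    # t = 1 gives Q
```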


Supplementary Problems 


VECTORS IN $R^n$

1.34. Let $u = (1, -2, 5)$, $v = (3, 1, -2)$. Find: (i) $u + v$; (ii) $-6u$; (iii) $2u - 5v$; (iv) $u \cdot v$; (v) $\|u\|$ and $\|v\|$; (vi) $d(u, v)$.

1.35. Let $u = (2, -1, 0, -3)$, $v = (1, -1, -1, 3)$, $w = (1, 3, -2, 2)$. Find: (i) $2u - 3v$; (ii) $5u - 3v - 4w$; (iii) $-u + 2v - 2w$; (iv) $u \cdot v$, $u \cdot w$ and $v \cdot w$; (v) $d(u, v)$ and $d(v, w)$.

1.36. Let $u = (2, 1, -3, 0, 4)$, $v = (5, -3, -1, 2, 7)$. Find: (i) $u + v$; (ii) $3u - 2v$; (iii) $u \cdot v$; (iv) $\|u\|$ and $\|v\|$; (v) $d(u, v)$.

1.37. Determine $k$ so that the vectors $u$ and $v$ are orthogonal. (i) $u = (3, k, -2)$, $v = (6, -4, -3)$. (ii) $u = (5, k, -4, 2)$, $v = (1, -3, 2, 2k)$. (iii) $u = (1, 7, k + 2, -2)$, $v = (3, k, -3, k)$.

1.38. Determine $x$ and $y$ if: (i) $(x, x + y) = (y - 2, 6)$; (ii) $x(1, 2) = -4(y, 3)$.

1.39. Determine $x$ and $y$ if: (i) $x(3, 2) = 2(y, -1)$; (ii) $x(2, y) = y(1, -2)$.

1.40. Determine $x$, $y$ and $z$ if:
(i) $(3, -1, 2) = x(1, 1, 1) + y(1, -1, 0) + z(1, 0, 0)$
(ii) $(-1, 3, 3) = x(1, 1, 0) + y(0, 0, -1) + z(0, 1, 1)$

1.41. Let $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3 = (0, 0, 1)$. Show that for any vector $u = (a, b, c)$ in $R^3$:
(i) $u = ae_1 + be_2 + ce_3$; (ii) $u \cdot e_1 = a$, $u \cdot e_2 = b$, $u \cdot e_3 = c$.

1.42. Generalize the result in the preceding problem as follows. Let $e_i \in R^n$ be the vector with 1 in the $i$th coordinate and 0 elsewhere:
$e_1 = (1, 0, 0, \ldots, 0, 0)$, $e_2 = (0, 1, 0, \ldots, 0, 0)$, $\ldots$, $e_n = (0, 0, 0, \ldots, 0, 1)$
Show that for any vector $u = (a_1, a_2, \ldots, a_n)$,
(i) $u = a_1e_1 + a_2e_2 + \cdots + a_ne_n$, (ii) $u \cdot e_i = a_i$ for $i = 1, \ldots, n$

1.43. Suppose $u \in R^n$ has the property that $u \cdot v = 0$ for every $v \in R^n$. Show that $u = 0$.

1.44. Using $d(u, v) = \|u - v\|$ and the norm properties $[N_1]$, $[N_2]$ and $[N_3]$ in Problem 1.18, show that the distance function satisfies the following properties for any vectors $u, v, w \in R^n$:
(i) $d(u, v) \ge 0$, and $d(u, v) = 0$ iff $u = v$; (ii) $d(u, v) = d(v, u)$; (iii) $d(u, w) \le d(u, v) + d(v, w)$.


COMPLEX NUMBERS

1.45. Simplify: (i) $(4 - 7i)(9 + 2i)$; (ii) $(3 - 5i)^2$; (iii) $\frac{1}{4 - 7i}$; (iv) $\frac{9 + 2i}{3 - 5i}$; (v) $(1 - i)^3$.

1.46. Simplify: (i) $\frac{1}{2i}$; (ii) $\frac{2 + 3i}{7 - 3i}$; (iii) $i^{15}$, $i^{25}$, $i^{34}$; (iv) $\left(\frac{1}{3 - i}\right)^2$.

1.47. Let $z = 2 - 5i$ and $w = 7 + 3i$. Find: (i) $z + w$; (ii) $zw$; (iii) $z/w$; (iv) $\bar{z}$, $\bar{w}$; (v) $|z|$, $|w|$.

1.48. Let $z = 2 + i$ and $w = 6 - 5i$. Find: (i) $z/w$; (ii) $\bar{z}$, $\bar{w}$; (iii) $|z|$, $|w|$.

1.49. Show that: (i) $zz^{-1} = 1$; (ii) $|\bar{z}| = |z|$; (iii) real part of $z = \frac{1}{2}(z + \bar{z})$; (iv) imaginary part of $z = (z - \bar{z})/2i$.

1.50. Show that $zw = 0$ implies $z = 0$ or $w = 0$.


VECTORS IN $C^n$

1.51. Let $u = (1 + 7i, 2 - 6i)$ and $v = (5 - 2i, 3 - 4i)$. Find: (i) $u + v$; (ii) $(3 + i)u$; (iii) $2iu + (4 - 7i)v$; (iv) $u \cdot v$ and $v \cdot u$; (v) $\|u\|$ and $\|v\|$.

1.52. Let $u = (3 - 7i, 2i, -1 + i)$ and $v = (4 - i, 11 + 2i, 8 - 3i)$. Find: (i) $u - v$; (ii) $(3 + i)v$; (iii) $u \cdot v$ and $v \cdot u$; (iv) $\|u\|$ and $\|v\|$.

1.53. Prove: For any vectors $u, v, w \in C^n$:
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$; (ii) $w \cdot (u + v) = w \cdot u + w \cdot v$. (Compare with Theorem 1.2.)

1.54. Prove that the norm in $C^n$ satisfies the following laws:
$[N_1]$: For any vector $u$, $\|u\| \ge 0$; and $\|u\| = 0$ iff $u = 0$.
$[N_2]$: For any vector $u$ and any complex number $z$, $\|zu\| = |z|\,\|u\|$.
$[N_3]$: For any vectors $u$ and $v$, $\|u + v\| \le \|u\| + \|v\|$.
(Compare with Problem 1.18.)


MISCELLANEOUS PROBLEMS

1.55. Find an equation of the hyperplane in $R^3$ which:
(i) passes through $(2, -7, 1)$ and is normal to $(3, 1, -11)$;
(ii) contains $(1, -2, 2)$, $(0, 1, 3)$ and $(0, 2, -1)$;
(iii) contains $(1, -5, 2)$ and is parallel to $3x - 7y + 4z = 5$.

1.56. Determine the value of $k$ such that $2x - ky + 4z - 5w = 11$ is perpendicular to $7x + 2y - z + 2w = 8$. (Two hyperplanes are perpendicular iff corresponding normal vectors are orthogonal.)


1.57. Find a parametric representation of the line which:
(i) passes through $(7, -1, 8)$ in the direction of $(1, 3, -5)$
(ii) passes through $(1, 9, -4, 5)$ and $(2, -3, 0, 4)$
(iii) passes through $(4, -1, 9)$ and is perpendicular to the plane $3x - 2y + z = 18$

1.58. Let $P$, $Q$ and $R$ be the points on the line determined by
$x_1 = a_1 + u_1t$, $x_2 = a_2 + u_2t$, $\ldots$, $x_n = a_n + u_nt$
which correspond respectively to the values $t_1$, $t_2$ and $t_3$ for $t$. Show that if $t_1 < t_2 < t_3$, then $d(P, Q) + d(Q, R) = d(P, R)$.


Answers to Supplementary Problems 


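1.34. (i) $u + v = (4, -1, 3)$; (ii) $-6u = (-6, 12, -30)$; (iii) $2u - 5v = (-13, -9, 20)$; (iv) $u \cdot v = -9$; (v) $\|u\| = \sqrt{30}$, $\|v\| = \sqrt{14}$; (vi) $d(u, v) = \sqrt{62}$

1.35. (i) $2u - 3v = (1, 1, 3, -15)$; (ii) $5u - 3v - 4w = (3, -14, 11, -32)$; (iii) $-u + 2v - 2w = (-2, -7, 2, 5)$; (iv) $u \cdot v = -6$, $u \cdot w = -7$, $v \cdot w = 6$; (v) $d(u, v) = \sqrt{38}$, $d(v, w) = 3\sqrt{2}$

1.36. (i) $u + v = (7, -2, -4, 2, 11)$; (ii) $3u - 2v = (-4, 9, -7, -4, -2)$; (iii) $u \cdot v = 38$; (iv) $\|u\| = \sqrt{30}$, $\|v\| = 2\sqrt{22}$; (v) $d(u, v) = \sqrt{42}$

1.37. (i) $k = 6$; (ii) $k = 3$; (iii) $k = 3/2$

1.38. (i) $x = 2$, $y = 4$; (ii) $x = -6$, $y = 3/2$

1.39. (i) $x = -1$, $y = -3/2$; (ii) $x = 0$, $y = 0$; or $x = -2$, $y = -4$

1.40. (i) $x = 2$, $y = 3$, $z = -2$; (ii) $x = -1$, $y = 1$, $z = 4$

1.43. We have that $u \cdot u = 0$ which implies that $u = 0$.

1.45. (i) $50 - 55i$; (ii) $-16 - 30i$; (iii) $(4 + 7i)/65$; (iv) $(1 + 3i)/2$; (v) $-2 - 2i$.

1.46. (i) $-\frac{1}{2}i$; (ii) $(5 + 27i)/58$; (iii) $-i$, $i$, $-1$; (iv) $(4 + 3i)/50$.

1.47. (i) $z + w = 9 - 2i$; (ii) $zw = 29 - 29i$; (iii) $z/w = (-1 - 41i)/58$; (iv) $\bar{z} = 2 + 5i$, $\bar{w} = 7 - 3i$; (v) $|z| = \sqrt{29}$, $|w| = \sqrt{58}$.

1.48. (i) $z/w = (7 + 16i)/61$; (ii) $\bar{z} = 2 - i$, $\bar{w} = 6 + 5i$; (iii) $|z| = \sqrt{5}$, $|w| = \sqrt{61}$.

1.50. If $zw = 0$, then $|zw| = |z||w| = |0| = 0$. Hence $|z| = 0$ or $|w| = 0$; and so $z = 0$ or $w = 0$.

1.51. (i) $u + v = (6 + 5i, 5 - 10i)$; (ii) $(3 + i)u = (-4 + 22i, 12 - 16i)$; (iii) $2iu + (4 - 7i)v = (-8 - 41i, -4 - 33i)$; (iv) $u \cdot v = 21 + 27i$, $v \cdot u = 21 - 27i$; (v) $\|u\| = 3\sqrt{10}$, $\|v\| = 3\sqrt{6}$

1.52. (i) $u - v = (-1 - 6i, -11, -9 + 4i)$; (ii) $(3 + i)v = (13 + i, 31 + 17i, 27 - i)$; (iii) $u \cdot v = 12 + 2i$, $v \cdot u = 12 - 2i$; (iv) $\|u\| = 8$, $\|v\| = \sqrt{215}$

1.55. (i) $3x + y - 11z = -12$; (ii) $13x + 4y + z = 7$; (iii) $3x - 7y + 4z = 46$.

1.56. $k = 0$

1.57. (i) $x = 7 + t$, $y = -1 + 3t$, $z = 8 - 5t$; (ii) $x = 1 + t$, $y = 9 - 12t$, $z = -4 + 4t$, $w = 5 - t$; (iii) $x = 4 + 3t$, $y = -1 - 2t$, $z = 9 + t$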


Chapter 2 


Linear Equations 


INTRODUCTION 


The theory of linear equations plays an important and motivating role in the subject
of linear algebra. In fact, many problems in linear algebra are equivalent to studying a
system of linear equations, e.g. finding the kernel of a linear mapping and characterizing
the subspace spanned by a set of vectors. Thus the techniques introduced in this chapter
will be applicable to the more abstract treatment given later. On the other hand, some of
the results of the abstract treatment will give us new insights into the structure of
"concrete" systems of linear equations.


For simplicity, we assume that all equations in this chapter are over the real field R. We 
emphasize that the results and techniques also hold for equations over the complex field C 
or over any arbitrary field K. 


LINEAR EQUATION 
By a linear equation over the real field $R$, we mean an expression of the form
$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$  (1)
where the $a_i, b \in R$ and the $x_i$ are indeterminants (or: unknowns or variables). The scalars $a_i$ are called the coefficients of the $x_i$ respectively, and $b$ is called the constant term or simply constant of the equation. A set of values for the unknowns, say
$x_1 = k_1, x_2 = k_2, \ldots, x_n = k_n$
is a solution of (1) if the statement obtained by substituting $k_i$ for $x_i$,
$a_1k_1 + a_2k_2 + \cdots + a_nk_n = b$
is true. This set of values is then said to satisfy the equation. If there is no ambiguity about the position of the unknowns in the equation, then we denote this solution by simply the n-tuple
$u = (k_1, k_2, \ldots, k_n)$


Example 2.1: Consider the equation $x + 2y - 4z + w = 3$.
The 4-tuple $u = (3, 2, 1, 0)$ is a solution of the equation since
$3 + 2 \cdot 2 - 4 \cdot 1 + 0 = 3$ or $3 = 3$
is a true statement. However, the 4-tuple $v = (1, 2, 4, 5)$ is not a solution of the equation since
$1 + 2 \cdot 2 - 4 \cdot 4 + 5 = 3$ or $-6 = 3$
is not a true statement.
Solutions of the equation (1) can be easily described and obtained. There are three cases:

Case (i): One of the coefficients in (1) is not zero, say $a_1 \ne 0$. Then we can rewrite the equation as follows
$a_1x_1 = b - a_2x_2 - \cdots - a_nx_n$ or $x_1 = a_1^{-1}b - a_1^{-1}a_2x_2 - \cdots - a_1^{-1}a_nx_n$
By arbitrarily assigning values to the unknowns $x_2, \ldots, x_n$, we obtain a value for $x_1$; these values form a solution of the equation. Furthermore, every solution of the equation can be obtained in this way. Note in particular that the linear equation in one unknown,
$ax = b$, with $a \ne 0$
has the unique solution $x = a^{-1}b$.
Example 2.2: Consider the equation $2x - 4y + z = 8$.
We rewrite the equation as
$2x = 8 + 4y - z$ or $x = 4 + 2y - \frac{1}{2}z$
Any value for $y$ and $z$ will yield a value for $x$, and the three values will be a solution of the equation. For example, let $y = 3$ and $z = 2$; then $x = 4 + 2 \cdot 3 - \frac{1}{2} \cdot 2 = 9$. In other words, the 3-tuple $u = (9, 3, 2)$ is a solution of the equation.


Case (ii): All the coefficients in (1) are zero, but the constant is not zero. That is, the equation is of the form
$0x_1 + 0x_2 + \cdots + 0x_n = b$, with $b \ne 0$
Then the equation has no solution.

Case (iii): All the coefficients in (1) are zero, and the constant is also zero. That is, the equation is of the form
$0x_1 + 0x_2 + \cdots + 0x_n = 0$
Then every n-tuple of scalars in $R$ is a solution of the equation.
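The three cases can be summarized in a short Python sketch. The function names `classify_equation` and `solve_for_pivot` are hypothetical, introduced only for illustration; exact rational arithmetic (`fractions.Fraction`) is an implementation choice, not something prescribed by the text.

```python
from fractions import Fraction

def classify_equation(coeffs, b):
    """Classify a1*x1 + ... + an*xn = b into the three cases of the text."""
    if any(a != 0 for a in coeffs):
        return "case (i)"      # some coefficient is nonzero: solvable
    if b != 0:
        return "case (ii)"     # 0 = b with b != 0: no solution
    return "case (iii)"        # 0 = 0: every n-tuple is a solution

def solve_for_pivot(coeffs, b, others):
    """Case (i): 'others' assigns values, in order, to every unknown except
    the first one with a nonzero coefficient; return the full solution."""
    pivot = next(i for i, a in enumerate(coeffs) if a != 0)
    values = list(others)
    values.insert(pivot, 0)    # placeholder for the unknown being solved for
    rest = sum(a * k for i, (a, k) in enumerate(zip(coeffs, values)) if i != pivot)
    values[pivot] = Fraction(b - rest) / coeffs[pivot]
    return tuple(values)

# Example 2.2: 2x - 4y + z = 8 with y = 3 and z = 2
print(classify_equation((2, -4, 1), 8))
print(solve_for_pivot((2, -4, 1), 8, (3, 2)))
```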


SYSTEM OF LINEAR EQUATIONS 


We now consider a system of $m$ linear equations in the $n$ unknowns $x_1, \ldots, x_n$:
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2$
$\cdots$  (*)
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m$
where the $a_{ij}$, $b_i$ belong to the real field $R$. The system is said to be homogeneous if the constants $b_1, \ldots, b_m$ are all 0. An n-tuple $u = (k_1, \ldots, k_n)$ of real numbers is a solution (or: a particular solution) if it satisfies each of the equations; the set of all such solutions is termed the solution set or the general solution.

The system of linear equations
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0$
$\cdots$  (**)
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0$
is called the homogeneous system associated with (*). The above system always has a solution, namely the zero n-tuple $0 = (0, 0, \ldots, 0)$ called the zero or trivial solution. Any other solution, if it exists, is called a nonzero or nontrivial solution.


The fundamental relationship between the systems (*) and (**) follows. 




Theorem 2.1: Suppose $u$ is a particular solution of the nonhomogeneous system (*) and suppose $W$ is the general solution of the associated homogeneous system (**). Then
$u + W = \{u + w : w \in W\}$
is the general solution of the nonhomogeneous system (*).


We emphasize that the above theorem is of theoretical interest and does not help us to 
obtain explicit solutions of the system (*). This is done by the usual method of elimination 
described in the next section. 
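Although Theorem 2.1 does not by itself produce solutions, its statement is easy to check numerically on a concrete system. The sketch below uses the system of Example 2.6 later in this chapter together with the particular solution $(9, -2, 3, 1)$ found there. The two-parameter formula for the homogeneous solutions, and the names `apply` and `homogeneous_solution`, are assumptions of this illustration rather than part of the text.

```python
# Coefficients and constants of the system of Example 2.6
A = [[1, 2, -2, 3],
     [2, 4, -3, 4],
     [5, 10, -8, 11]]
b = [2, 5, 12]

def apply(A, x):
    """Evaluate the left-hand side of every equation at the n-tuple x."""
    return [sum(a * k for a, k in zip(row, x)) for row in A]

u = (9, -2, 3, 1)    # a particular solution of the nonhomogeneous system

def homogeneous_solution(s, t):
    """A solution of the associated homogeneous system; s, t are arbitrary."""
    return (-2 * s + t, s, 2 * t, t)

for s in range(-2, 3):
    for t in range(-2, 3):
        w = homogeneous_solution(s, t)
        x = tuple(ui + wi for ui, wi in zip(u, w))
        assert apply(A, w) == [0, 0, 0]   # w solves (**)
        assert apply(A, x) == b           # u + w solves (*)
print("u + w solved the system for every tested w")
```

The check covers only finitely many values of the parameters, so it illustrates the theorem without proving it.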


SOLUTION OF A SYSTEM OF LINEAR EQUATIONS 


Consider the above system (*) of linear equations. We reduce it to a simpler system as 
follows: 


Step 1. Interchange equations so that the first unknown $x_1$ has a nonzero coefficient in the first equation, that is, so that $a_{11} \ne 0$.
Step 2. For each $i > 1$, apply the operation
$L_i \to -a_{i1}L_1 + a_{11}L_i$
That is, replace the $i$th linear equation $L_i$ by the equation obtained by multiplying the first equation $L_1$ by $-a_{i1}$, multiplying the $i$th equation $L_i$ by $a_{11}$, and then adding.

We then obtain the following system which (Problem 2.13) is equivalent to (*), i.e. has the same solution set as (*):
$a'_{11}x_1 + a'_{12}x_2 + a'_{13}x_3 + \cdots + a'_{1n}x_n = b'_1$
$a'_{2j_2}x_{j_2} + \cdots + a'_{2n}x_n = b'_2$
$\cdots$
$a'_{mj_2}x_{j_2} + \cdots + a'_{mn}x_n = b'_m$
where $a'_{11} \ne 0$. Here $x_{j_2}$ denotes the first unknown with a nonzero coefficient in an equation other than the first; by Step 2, $x_{j_2} \ne x_1$. This process which eliminates an unknown from succeeding equations is known as (Gauss) elimination.
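Steps 1 and 2 can be sketched in Python on a system stored as augmented rows $[a_{i1}, \ldots, a_{in}, b_i]$. The function name `eliminate_first_unknown` and the row representation are assumptions of this illustration. It is applied here to the system of Problem 2.3 later in this chapter.

```python
def eliminate_first_unknown(rows):
    """Apply Steps 1 and 2 once; each row is [a_i1, ..., a_in, b_i].
    Returns new rows and leaves the input unchanged."""
    rows = [list(r) for r in rows]
    # Step 1: bring an equation with a nonzero first coefficient to the top.
    for i, row in enumerate(rows):
        if row[0] != 0:
            rows[0], rows[i] = rows[i], rows[0]
            break
    else:
        return rows            # the first unknown occurs in no equation
    first = rows[0]
    a11 = first[0]
    # Step 2: L_i -> -a_i1 * L_1 + a_11 * L_i for each i > 1.
    for i in range(1, len(rows)):
        ai1 = rows[i][0]
        rows[i] = [-ai1 * p + a11 * q for p, q in zip(first, rows[i])]
    return rows

system = [[2, 1, -2, 10],
          [3, 2, 2, 1],
          [5, 4, 3, 4]]
for row in eliminate_first_unknown(system):
    print(row)
```

Because both equations are multiplied by integers before adding, integer coefficients stay integers; this is the reason for using $-a_{i1}L_1 + a_{11}L_i$ rather than dividing by $a_{11}$.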


Example 2.3: Consider the following system of linear equations:
$2x + 4y - z + 2v + 2w = 1$
$3x + 6y + z - v + 4w = -7$
$4x + 8y + z + 5v - w = 3$

We eliminate the unknown $x$ from the second and third equations by applying the following operations:
$L_2 \to -3L_1 + 2L_2$ and $L_3 \to -2L_1 + L_3$

We compute
$-3L_1$: $-6x - 12y + 3z - 6v - 6w = -3$
$2L_2$: $6x + 12y + 2z - 2v + 8w = -14$
$-3L_1 + 2L_2$: $5z - 8v + 2w = -17$
and
$-2L_1$: $-4x - 8y + 2z - 4v - 4w = -2$
$L_3$: $4x + 8y + z + 5v - w = 3$
$-2L_1 + L_3$: $3z + v - 5w = 1$

Thus the original system has been reduced to the following equivalent system:
$2x + 4y - z + 2v + 2w = 1$
$5z - 8v + 2w = -17$
$3z + v - 5w = 1$

Observe that $y$ has also been eliminated from the second and third equations. Here the unknown $z$ plays the role of the unknown $x_{j_2}$ above.


We note that the above equations, excluding the first, form a subsystem which has 
fewer equations and fewer unknowns than the original system (*). We also note that: 


(i) if an equation $0x_1 + \cdots + 0x_n = b$, $b \ne 0$ occurs, then the system is inconsistent and has no solution;

(ii) if an equation $0x_1 + \cdots + 0x_n = 0$ occurs, then the equation can be deleted without affecting the solution.


Continuing the above process with each new “smaller” subsystem, we obtain by induction 
that the system (*) is either inconsistent or is reducible to an equivalent system in the 
following form 


$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1$
$a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n = b_2$
$\cdots$  (***)
$a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \cdots + a_{rn}x_n = b_r$
where $1 < j_2 < \cdots < j_r$ and where the leading coefficients are not zero:
$a_{11} \ne 0, a_{2j_2} \ne 0, \ldots, a_{rj_r} \ne 0$

(For notational convenience we use the same symbols $a_{ij}$, $b_k$ in the system (***) as we used in the system (*), but clearly they may denote different scalars.)

Definition: The above system (***) is said to be in echelon form; the unknowns $x_i$ which do not appear at the beginning of any equation ($i \ne 1, j_2, \ldots, j_r$) are termed free variables.


The following theorem applies. 


Theorem 2.2: The solution of the system (***) in echelon form is as follows. There are 
two cases: 


(i) r=n. That is, there are as many equations as unknowns. Then the 
system has a unique solution. 


(ii) r<n. That is, there are fewer equations than unknowns. Then we 
can arbitrarily assign values to the n—r free variables and obtain a 


solution of the system. 


Note in particular that the above theorem implies that the system (***) and any equiv- 
alent systems are consistent. Thus if the system (*) is consistent and reduces to case (ii) 
above, then we can assign many different values to the free variables and so obtain many 
solutions of the system. The following diagram illustrates this situation. 


System of linear equations
    Inconsistent: no solution
    Consistent: unique solution, or more than one solution

In view of Theorem 2.1, the unique solution above can only occur when the associated 
homogeneous system has only the zero solution. 


Example 2.4: We reduce the following system by applying the operations $L_2 \to -3L_1 + 2L_2$ and $L_3 \to -3L_1 + 2L_3$, and then the operation $L_3 \to -3L_2 + L_3$:
$2x + y - 2z + 3w = 1$
$3x + 2y - z + 2w = 4$
$3x + 3y + 3z - 3w = 5$
which becomes
$2x + y - 2z + 3w = 1$
$y + 4z - 5w = 5$
$3y + 12z - 15w = 7$
and then
$2x + y - 2z + 3w = 1$
$y + 4z - 5w = 5$
$0 = -8$

The equation $0 = -8$, that is, $0x + 0y + 0z + 0w = -8$, shows that the original system is inconsistent, and so has no solution.

Example 2.5: We reduce the following system by applying the operations $L_2 \to -L_1 + L_2$, $L_3 \to -2L_1 + L_3$ and $L_4 \to -2L_1 + L_4$, and then the operations $L_3 \to L_2 - L_3$ and $L_4 \to -2L_2 + L_4$:
$x + 2y - 3z = 4$
$x + 3y + z = 11$
$2x + 5y - 4z = 13$
$2x + 6y + 2z = 22$
which becomes
$x + 2y - 3z = 4$
$y + 4z = 7$
$y + 2z = 5$
$2y + 8z = 14$
and then
$x + 2y - 3z = 4$
$y + 4z = 7$
$2z = 2$
$0 = 0$
so that the system in echelon form is
$x + 2y - 3z = 4$
$y + 4z = 7$
$2z = 2$

Observe first that the system is consistent since there is no equation of the form $0 = b$, with $b \ne 0$. Furthermore, since in echelon form there are three equations in the three unknowns, the system has a unique solution. By the third equation, $z = 1$. Substituting $z = 1$ into the second equation, we obtain $y = 3$. Substituting $z = 1$ and $y = 3$ into the first equation, we find $x = 1$. Thus $x = 1$, $y = 3$ and $z = 1$ or, in other words, the 3-tuple $(1, 3, 1)$, is the unique solution of the system.
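The reduction to echelon form followed by back-substitution can be sketched in Python as below. This is a minimal illustration under stated assumptions: equations are stored as augmented rows, exact arithmetic is done with `fractions.Fraction`, and the names `to_echelon` and `unique_solution` are hypothetical. The pivot equation is chosen as the first one with a nonzero coefficient, so intermediate equations may differ from the text's by a nonzero factor.

```python
from fractions import Fraction

def to_echelon(rows):
    """Reduce augmented rows [a_1, ..., a_n, b] to echelon form.
    Returns the nonzero echelon rows, or None if an equation 0 = b
    with b != 0 appears (the system is inconsistent)."""
    rows = [[Fraction(x) for x in r] for r in rows]
    n = len(rows[0]) - 1                 # number of unknowns
    echelon = []
    col = 0
    while rows and col < n:
        pivot = next((r for r in rows if r[col] != 0), None)
        if pivot is None:
            col += 1                     # this unknown is a free variable
            continue
        rows.remove(pivot)
        # Eliminate the current unknown from the remaining equations.
        rows = [[pivot[col] * x - r[col] * p for x, p in zip(r, pivot)]
                for r in rows]
        echelon.append(pivot)
        col += 1
    # Any equation left over has all coefficients zero.
    if any(r[n] != 0 for r in rows):
        return None
    return echelon

def unique_solution(echelon, n):
    """Back-substitute when there are as many echelon equations as unknowns."""
    if len(echelon) != n:
        raise ValueError("free variables present: no unique solution")
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        row = echelon[i]
        known = sum(row[j] * x[j] for j in range(i + 1, n))
        x[i] = (row[n] - known) / row[i]
    return tuple(x)

example_2_5 = [[1, 2, -3, 4], [1, 3, 1, 11], [2, 5, -4, 13], [2, 6, 2, 22]]
print(unique_solution(to_echelon(example_2_5), 3))
```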


Example 2.6: We reduce the following system by applying the operations $L_2 \to -2L_1 + L_2$ and $L_3 \to -5L_1 + L_3$, and then the operation $L_3 \to -2L_2 + L_3$:
$x + 2y - 2z + 3w = 2$
$2x + 4y - 3z + 4w = 5$
$5x + 10y - 8z + 11w = 12$
which becomes
$x + 2y - 2z + 3w = 2$
$z - 2w = 1$
$2z - 4w = 2$
and then
$x + 2y - 2z + 3w = 2$
$z - 2w = 1$
$0 = 0$
so that the system in echelon form is
$x + 2y - 2z + 3w = 2$
$z - 2w = 1$

The system is consistent, and since there are more unknowns than equations in echelon form, the system has an infinite number of solutions. In fact, there are two free variables, $y$ and $w$, and so a particular solution can be obtained by giving $y$ and $w$ any values. For example, let $w = 1$ and $y = -2$. Substituting $w = 1$ into the second equation, we obtain $z = 3$. Putting $w = 1$, $z = 3$ and $y = -2$ into the first equation, we find $x = 9$. Thus $x = 9$, $y = -2$, $z = 3$ and $w = 1$ or, in other words, the 4-tuple $(9, -2, 3, 1)$ is a particular solution of the system.


Remark: We find the general solution of the system in the above example as follows. Let the free variables be assigned arbitrary values; say, $y = a$ and $w = b$. Substituting $w = b$ into the second equation, we obtain $z = 1 + 2b$. Putting $y = a$, $z = 1 + 2b$ and $w = b$ into the first equation, we find $x = 4 - 2a + b$. Thus the general solution of the system is
$x = 4 - 2a + b$, $y = a$, $z = 1 + 2b$, $w = b$
or, in other words, $(4 - 2a + b, a, 1 + 2b, b)$, where $a$ and $b$ are arbitrary numbers. Frequently, the general solution is left in terms of the free variables $y$ and $w$ (instead of $a$ and $b$) as follows:
$x = 4 - 2y + w$, $z = 1 + 2w$ or $(4 - 2y + w, y, 1 + 2w, w)$

We will investigate further the representation of the general solution of a system of linear equations in a later chapter.


Example 2.7: Consider two equations in two unknowns:
$a_1x + b_1y = c_1$
$a_2x + b_2y = c_2$
According to our theory, exactly one of the following three cases must occur:
(i) The system is inconsistent.
(ii) The system is equivalent to two equations in echelon form.
(iii) The system is equivalent to one equation in echelon form.

Since linear equations in two unknowns with real coefficients can be represented as lines in the plane $R^2$, the above cases can be interpreted geometrically as follows:
(i) The two lines are parallel.
(ii) The two lines intersect in a unique point.
(iii) The two lines are coincident.
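The three geometric cases can be told apart without carrying out the reduction, by testing whether the coefficient rows are proportional. The sketch below uses the $2 \times 2$ determinant for this test, which is a shortcut swapped in for the echelon reduction of the text; the function name `classify_two_lines` is hypothetical, and each equation is assumed to have at least one nonzero coefficient so that it really represents a line.

```python
def classify_two_lines(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2,
    assuming (a1, b1) and (a2, b2) are both nonzero."""
    det = a1 * b2 - a2 * b1
    if det != 0:
        return "intersect in a unique point"
    # Proportional normal vectors: the lines are parallel or coincident.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "coincident"
    return "parallel"

print(classify_two_lines(1, 1, 1, 1, -1, 0))   # x + y = 1, x - y = 0
print(classify_two_lines(1, 1, 1, 1, 1, 2))    # x + y = 1, x + y = 2
print(classify_two_lines(1, 1, 1, 2, 2, 2))    # x + y = 1, 2x + 2y = 2
```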


SOLUTION OF A HOMOGENEOUS SYSTEM OF LINEAR EQUATIONS 


If we begin with a homogeneous system of linear equations, then the system is clearly consistent since, for example, it has the zero solution $0 = (0, 0, \ldots, 0)$. Thus it can always be reduced to an equivalent homogeneous system in echelon form:
$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = 0$
$a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n = 0$
$\cdots$
$a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \cdots + a_{rn}x_n = 0$

Hence we have the two possibilities:
(i) $r = n$. Then the system has only the zero solution.
(ii) $r < n$. Then the system has a nonzero solution.

If we begin with fewer equations than unknowns then, in echelon form, $r < n$ and hence the system has a nonzero solution. That is,


Theorem 2.3: A homogeneous system of linear equations with more unknowns than equations has a nonzero solution.


Example 2.8: The homogeneous system
$x + 2y - 3z + w = 0$
$x - 3y + z - 2w = 0$
$2x + y - 3z + 5w = 0$
has a nonzero solution since there are four unknowns but only three equations.


Example 2.9: We reduce the following system to echelon form:
$x + y - z = 0$
$2x - 3y + z = 0$
$x - 4y + 2z = 0$
which becomes
$x + y - z = 0$
$-5y + 3z = 0$
$-5y + 3z = 0$
and then
$x + y - z = 0$
$-5y + 3z = 0$

The system has a nonzero solution, since we obtained only two equations in the three unknowns in echelon form. For example, let $z = 5$; then $y = 3$ and $x = 2$. In other words, the 3-tuple $(2, 3, 5)$ is a particular nonzero solution.


Example 2.10: We reduce the following system to echelon form:
$x + y - z = 0$
$2x + 4y - z = 0$
$3x + 2y + 2z = 0$
which becomes
$x + y - z = 0$
$2y + z = 0$
$-y + 5z = 0$
and then
$x + y - z = 0$
$2y + z = 0$
$11z = 0$

Since in echelon form there are three equations in three unknowns, the system has only the zero solution $(0, 0, 0)$.


Solved Problems 


SOLUTION OF LINEAR EQUATIONS

2.1. Solve the system:
$2x - 3y + 6z + 2v - 5w = 3$
$y - 4z + v = 1$
$v - 3w = 2$

The system is in echelon form. Since the equations begin with the unknowns $x$, $y$ and $v$ respectively, the other unknowns, $z$ and $w$, are the free variables.

To find the general solution, let, say, $z = a$ and $w = b$. Substituting into the third equation,
$v - 3b = 2$ or $v = 2 + 3b$
Substituting into the second equation,
$y - 4a + 2 + 3b = 1$ or $y = 4a - 3b - 1$
Substituting into the first equation,
$2x - 3(4a - 3b - 1) + 6a + 2(2 + 3b) - 5b = 3$ or $x = 3a - 5b - 2$
Thus the general solution of the system is
$x = 3a - 5b - 2$, $y = 4a - 3b - 1$, $z = a$, $v = 2 + 3b$, $w = b$
or $(3a - 5b - 2, 4a - 3b - 1, a, 2 + 3b, b)$, where $a$ and $b$ are arbitrary real numbers. Some texts leave the general solution in terms of the free variables $z$ and $w$ instead of $a$ and $b$ as follows:
$x = 3z - 5w - 2$, $y = 4z - 3w - 1$, $v = 2 + 3w$ or $(3z - 5w - 2, 4z - 3w - 1, z, 2 + 3w, w)$

After finding the general solution, we can find a particular solution by substituting into the general solution. For example, let $a = 2$ and $b = 1$; then
$x = -1$, $y = 4$, $z = 2$, $v = 5$, $w = 1$ or $(-1, 4, 2, 5, 1)$
is a particular solution of the given system.


2.2. Solve the system:
$x + 2y - 3z = -1$
$3x - y + 2z = 7$
$5x + 3y - 4z = 2$

Reduce to echelon form. Eliminate $x$ from the second and third equations by the operations $L_2 \to -3L_1 + L_2$ and $L_3 \to -5L_1 + L_3$:
$-3L_1$: $-3x - 6y + 9z = 3$
$L_2$: $3x - y + 2z = 7$
$-3L_1 + L_2$: $-7y + 11z = 10$
and
$-5L_1$: $-5x - 10y + 15z = 5$
$L_3$: $5x + 3y - 4z = 2$
$-5L_1 + L_3$: $-7y + 11z = 7$

Thus we obtain the equivalent system
$x + 2y - 3z = -1$
$-7y + 11z = 10$
$-7y + 11z = 7$

The second and third equations show that the system is inconsistent, for if we subtract we obtain $0x + 0y + 0z = 3$ or $0 = 3$.


2.3. Solve the system:
$2x + y - 2z = 10$
$3x + 2y + 2z = 1$
$5x + 4y + 3z = 4$

Reduce to echelon form. Eliminate $x$ from the second and third equations by the operations $L_2 \to -3L_1 + 2L_2$ and $L_3 \to -5L_1 + 2L_3$:
$-3L_1$: $-6x - 3y + 6z = -30$
$2L_2$: $6x + 4y + 4z = 2$
$-3L_1 + 2L_2$: $y + 10z = -28$
and
$-5L_1$: $-10x - 5y + 10z = -50$
$2L_3$: $10x + 8y + 6z = 8$
$-5L_1 + 2L_3$: $3y + 16z = -42$

Thus we obtain the following system from which we eliminate $y$ from the third equation by the operation $L_3 \to -3L_2 + L_3$:
$2x + y - 2z = 10$
$y + 10z = -28$
$3y + 16z = -42$
which becomes
$2x + y - 2z = 10$
$y + 10z = -28$
$-14z = 42$

In echelon form there are three equations in the three unknowns; hence the system has a unique solution. By the third equation, $z = -3$. Substituting into the second equation, we find $y = 2$. Substituting into the first equation, we obtain $x = 1$. Thus $x = 1$, $y = 2$ and $z = -3$, i.e. the 3-tuple $(1, 2, -3)$, is the unique solution of the system.


2.4. Solve the system:
$x + 2y - 3z = 6$
$2x - y + 4z = 2$
$4x + 3y - 2z = 14$

Reduce the system to echelon form. Eliminate $x$ from the second and third equations by the operations $L_2 \to -2L_1 + L_2$ and $L_3 \to -4L_1 + L_3$:
$-2L_1$: $-2x - 4y + 6z = -12$
$L_2$: $2x - y + 4z = 2$
$-2L_1 + L_2$: $-5y + 10z = -10$ or $y - 2z = 2$
and
$-4L_1$: $-4x - 8y + 12z = -24$
$L_3$: $4x + 3y - 2z = 14$
$-4L_1 + L_3$: $-5y + 10z = -10$ or $y - 2z = 2$

Thus the system is equivalent to
$x + 2y - 3z = 6$
$y - 2z = 2$
$y - 2z = 2$
or simply
$x + 2y - 3z = 6$
$y - 2z = 2$
(Since the second and third equations are identical, we can disregard one of them.)

In echelon form there are only two equations in the three unknowns; hence the system has an infinite number of solutions and, in particular, $3 - 2 = 1$ free variable which is $z$.

To obtain the general solution let, say, $z = a$. Substitute into the second equation to obtain $y = 2 + 2a$. Substitute into the first equation to obtain $x + 2(2 + 2a) - 3a = 6$ or $x = 2 - a$. Thus the general solution is
$x = 2 - a$, $y = 2 + 2a$, $z = a$ or $(2 - a, 2 + 2a, a)$
where $a$ is any real number.

The value, say, $a = 1$ yields the particular solution $x = 1$, $y = 4$, $z = 1$ or $(1, 4, 1)$.


2.5. Solve the system:
$x - 3y + 4z - 2w = 5$
$2y + 5z + w = 2$
$y - 3z = 4$

The system is not in echelon form since, for example, $y$ appears as the first unknown in both the second and third equations. However, if we rewrite the system so that $w$ is the second unknown, then we obtain the following system which is in echelon form:
$x - 2w - 3y + 4z = 5$
$w + 2y + 5z = 2$
$y - 3z = 4$

Now if a 4-tuple $(a, b, c, d)$ is given as a solution, it is not clear if $b$ should be substituted for $w$ or for $y$; hence for theoretical reasons we consider the two systems to be distinct. Of course this does not prohibit us from using the new system to obtain the solution of the original system.

Let $z = a$. Substituting into the third equation, we find $y = 4 + 3a$. Substituting into the second equation, we obtain $w + 2(4 + 3a) + 5a = 2$ or $w = -6 - 11a$. Substituting into the first equation,
$x - 2(-6 - 11a) - 3(4 + 3a) + 4a = 5$ or $x = 5 - 17a$

Thus the general solution of the original system is
$x = 5 - 17a$, $y = 4 + 3a$, $z = a$, $w = -6 - 11a$
where $a$ is any real number.


2.6. Determine the values of $a$ so that the following system in unknowns $x$, $y$ and $z$ has: (i) no solution, (ii) more than one solution, (iii) a unique solution:
$x + y - z = 1$
$2x + 3y + az = 3$
$x + ay + 3z = 2$

Reduce the system to echelon form. Eliminate $x$ from the second and third equations by the operations $L_2 \to -2L_1 + L_2$ and $L_3 \to -L_1 + L_3$:
$-2L_1$: $-2x - 2y + 2z = -2$
$L_2$: $2x + 3y + az = 3$
$-2L_1 + L_2$: $y + (a + 2)z = 1$
and
$-L_1$: $-x - y + z = -1$
$L_3$: $x + ay + 3z = 2$
$-L_1 + L_3$: $(a - 1)y + 4z = 1$
Thus the equivalent system is
$x + y - z = 1$
$y + (a + 2)z = 1$
$(a - 1)y + 4z = 1$

Now eliminate $y$ from the third equation by the operation $L_3 \to -(a - 1)L_2 + L_3$,
$-(a - 1)L_2$: $-(a - 1)y + (2 - a - a^2)z = 1 - a$
$L_3$: $(a - 1)y + 4z = 1$
$-(a - 1)L_2 + L_3$: $(6 - a - a^2)z = 2 - a$ or $(3 + a)(2 - a)z = 2 - a$
to obtain the equivalent system
$x + y - z = 1$
$y + (a + 2)z = 1$
$(3 + a)(2 - a)z = 2 - a$
which has a unique solution if the coefficient of $z$ in the third equation is not zero, that is, if $a \ne 2$ and $a \ne -3$. In case $a = 2$, the third equation is $0 = 0$ and the system has more than one solution. In case $a = -3$, the third equation is $0 = 5$ and the system has no solution.

Summarizing, we have: (i) $a = -3$, (ii) $a = 2$, (iii) $a \ne 2$ and $a \ne -3$.


Which condition must be placed on a, b and c so that the following system in unknowns 


x, y and z has a solution? 
xet2y-— 32 = 


a 

2x + 6y—1llz = b 

Pe eye he 

Reduce to echelon form. Eliminating x from the second and third equation by the operations 
L, > —2L,+L, and L3; > —L, + Lz, we obtain the equivalent system 
Liat 2Y oe =e 

2y — bz =-b— 2a 

—4y +102 = c-—a 


Eliminating y from the third equation by the operation L3; > 2L,+ Ls, we finally obtain the 


equivalent system 
e+ 2y—32 = a 


2y — bz =". b = 2a 
c= 


| 
ic) 
+ 
bo 
o 
on 
a 


28 LINEAR EQUATIONS (CHAP. 2 


The system will have no solution if the third equation is of the form 0 = k, with k#0; that is, 
if c+2b—5a ~ 0. Thus the system will have at least one solution if 


c+2b—5a = 0 or Gh == Anse © 


Note, in this case, that the system will have more than one solution. In other words, the system 
cannot have a unique solution. 
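
The elimination in Problem 2.7 can be replayed numerically for particular values of a, b and c. The following is a minimal Python sketch under that reading, not part of the original text. The names `reduce_problem_2_7` and `has_solution` are hypothetical. Each row holds the coefficients of x, y, z followed by the constant term.

```python
def reduce_problem_2_7(a, b, c):
    """Apply the three row operations of Problem 2.7 to the augmented rows."""
    L1 = [1, 2, -3, a]
    L2 = [2, 6, -11, b]
    L3 = [1, -2, 7, c]
    L2 = [-2 * p + q for p, q in zip(L1, L2)]  # L2 -> -2 L1 + L2
    L3 = [-p + q for p, q in zip(L1, L3)]      # L3 -> -L1 + L3
    L3 = [2 * p + q for p, q in zip(L2, L3)]   # L3 -> 2 L2 + L3
    return [L1, L2, L3]


def has_solution(a, b, c):
    """The system is solvable exactly when the last row reads 0 = 0."""
    last_row = reduce_problem_2_7(a, b, c)[2]
    return last_row[3] == 0


consistent = has_solution(1, 1, 3)    # c + 2b - 5a = 3 + 2 - 5 = 0
inconsistent = has_solution(1, 1, 1)  # c + 2b - 5a = 1 + 2 - 5 = -2
```

The last row always has zero coefficients, and its constant term equals c + 2b - 5a, which matches the condition derived above.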


HOMOGENEOUS SYSTEMS OF LINEAR EQUATIONS 


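2.8. Determine whether each system has a nonzero solution:

\[
\text{(i)}\quad
\begin{aligned}
x - 2y + 3z - 2w &= 0 \\
3x - 7y - 2z + 4w &= 0 \\
4x + 3y + 5z + 2w &= 0
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
x + 2y - 3z &= 0 \\
2x + 5y + 2z &= 0 \\
3x - y - 4z &= 0
\end{aligned}
\qquad
\text{(iii)}\quad
\begin{aligned}
x + 2y - z &= 0 \\
2x + 5y + 2z &= 0 \\
x + 4y + 7z &= 0 \\
x + 3y + 3z &= 0
\end{aligned}
\]

(i) The system must have a nonzero solution since there are more unknowns than equations.

(ii) Reduce to echelon form:

\[
\begin{aligned}
x + 2y - 3z &= 0 \\
2x + 5y + 2z &= 0 \\
3x - y - 4z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y - 3z &= 0 \\
y + 8z &= 0 \\
-7y + 5z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y - 3z &= 0 \\
y + 8z &= 0 \\
61z &= 0
\end{aligned}
\]

In echelon form there are exactly three equations in the three unknowns; hence the system has
a unique solution, the zero solution.

(iii) Reduce to echelon form:

\[
\begin{aligned}
x + 2y - z &= 0 \\
2x + 5y + 2z &= 0 \\
x + 4y + 7z &= 0 \\
x + 3y + 3z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y - z &= 0 \\
y + 4z &= 0 \\
2y + 8z &= 0 \\
y + 4z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y - z &= 0 \\
y + 4z &= 0
\end{aligned}
\]

In echelon form there are only two equations in the three unknowns; hence the system has a
nonzero solution.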


2.9. The vectors $u_1, \ldots, u_m$ in, say, $R^n$ are said to be linearly dependent, or simply
dependent, if there exist scalars $k_1, \ldots, k_m$, not all of them zero, such that
$k_1u_1 + \cdots + k_mu_m = 0$. Otherwise they are said to be independent. Determine
whether the vectors u, v and w are dependent or independent where:

(i) u = (1, 1, -1), v = (2, -3, 1), w = (8, -7, 1)
(ii) u = (1, -2, -3), v = (2, 3, -1), w = (3, 2, 1)
(iii) $u = (a_1, a_2)$, $v = (b_1, b_2)$, $w = (c_1, c_2)$

In each case:
(a) let xu + yv + zw = 0 where x, y and z are unknown scalars;

(b) find the equivalent homogeneous system of equations;

(c) determine whether the system has a nonzero solution. If the system does, then the vectors are
dependent; if the system does not, then they are independent.

(i) Let xu + yv + zw = 0:

\[
\begin{aligned}
x(1, 1, -1) + y(2, -3, 1) + z(8, -7, 1) &= (0, 0, 0) \\
\text{or}\quad (x, x, -x) + (2y, -3y, y) + (8z, -7z, z) &= (0, 0, 0) \\
\text{or}\quad (x + 2y + 8z,\ x - 3y - 7z,\ -x + y + z) &= (0, 0, 0)
\end{aligned}
\]


Set corresponding components equal to each other and reduce the system to echelon form:

\[
\begin{aligned}
x + 2y + 8z &= 0 \\
x - 3y - 7z &= 0 \\
-x + y + z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y + 8z &= 0 \\
-5y - 15z &= 0 \\
3y + 9z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y + 8z &= 0 \\
y + 3z &= 0 \\
y + 3z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y + 8z &= 0 \\
y + 3z &= 0
\end{aligned}
\]

In echelon form there are only two equations in the three unknowns; hence the system has a
nonzero solution. Accordingly, the vectors are dependent.

Remark: We need not solve the system to determine dependence or independence; we only need
to know if a nonzero solution exists.

(ii)
\[
\begin{aligned}
x(1, -2, -3) + y(2, 3, -1) + z(3, 2, 1) &= (0, 0, 0) \\
(x, -2x, -3x) + (2y, 3y, -y) + (3z, 2z, z) &= (0, 0, 0) \\
(x + 2y + 3z,\ -2x + 3y + 2z,\ -3x - y + z) &= (0, 0, 0)
\end{aligned}
\]

\[
\begin{aligned}
x + 2y + 3z &= 0 \\
-2x + 3y + 2z &= 0 \\
-3x - y + z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y + 3z &= 0 \\
7y + 8z &= 0 \\
5y + 10z &= 0
\end{aligned}
\quad\text{to}\quad
\begin{aligned}
x + 2y + 3z &= 0 \\
7y + 8z &= 0 \\
30z &= 0
\end{aligned}
\]

In echelon form there are exactly three equations in the three unknowns; hence the system has
only the zero solution. Accordingly, the vectors are independent.

(iii)
\[
\begin{aligned}
x(a_1, a_2) + y(b_1, b_2) + z(c_1, c_2) &= (0, 0) \\
(a_1x, a_2x) + (b_1y, b_2y) + (c_1z, c_2z) &= (0, 0) \\
(a_1x + b_1y + c_1z,\ a_2x + b_2y + c_2z) &= (0, 0)
\end{aligned}
\qquad\text{and so}\qquad
\begin{aligned}
a_1x + b_1y + c_1z &= 0 \\
a_2x + b_2y + c_2z &= 0
\end{aligned}
\]

The system has a nonzero solution by Theorem 2.3, i.e. because there are more unknowns than
equations; hence the vectors are dependent. In other words, we have proven that any three
vectors in $R^2$ are dependent.

2.10. Suppose in a homogeneous system of linear equations the coefficients of one of the
unknowns are all zero. Show that the system has a nonzero solution.

Suppose $x_1, \ldots, x_n$ are the unknowns of the system, and $x_j$ is the unknown whose coefficients
are all zero. Then each equation of the system is of the form

\[ a_1x_1 + \cdots + a_{j-1}x_{j-1} + 0x_j + a_{j+1}x_{j+1} + \cdots + a_nx_n = 0 \]

Then for example (0, ..., 0, 1, 0, ..., 0), where 1 is the jth component, is a nonzero solution of each
equation and hence of the system.
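
The dependence test of Problem 2.9 (form the homogeneous system, reduce it, and compare the number of nonzero equations with the number of unknowns) can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's procedure verbatim. The helper names `rank` and `are_dependent` are hypothetical, and exact arithmetic with `fractions.Fraction` is a choice made here to avoid rounding.

```python
from fractions import Fraction


def rank(rows):
    """Count the nonzero rows left after Gaussian elimination (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def are_dependent(vectors):
    """True if x1*v1 + ... + xk*vk = 0 has a nonzero solution."""
    # The vectors are the columns of the coefficient matrix of the homogeneous system.
    coefficients = [list(column) for column in zip(*vectors)]
    return rank(coefficients) < len(vectors)


case_i = are_dependent([(1, 1, -1), (2, -3, 1), (8, -7, 1)])
case_ii = are_dependent([(1, -2, -3), (2, 3, -1), (3, 2, 1)])
case_iii = are_dependent([(1, 0), (0, 1), (1, 1)])  # any three vectors in R^2
```

With the vectors of cases (i) and (ii) as read above, `case_i` is `True` and `case_ii` is `False`; `case_iii` is `True` for any choice of three vectors in the plane.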


MISCELLANEOUS PROBLEMS 


2.11. Prove Theorem 2.1: Suppose u is a particular solution of the nonhomogeneous system
(*) and suppose W is the general solution of the associated homogeneous system (**).
Then

\[ u + W = \{u + w : w \in W\} \]

is the general solution of the nonhomogeneous system (*).

Let U denote the general solution of the nonhomogeneous system (*). Suppose $u \in U$ and that
$u = (u_1, \ldots, u_n)$. Since u is a solution of (*), we have for i = 1, ..., m,

\[ a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = b_i \]

Now suppose $w \in W$ and that $w = (w_1, \ldots, w_n)$. Since w is a solution of the homogeneous system
(**), we have for i = 1, ..., m,

\[ a_{i1}w_1 + a_{i2}w_2 + \cdots + a_{in}w_n = 0 \]


Therefore, for i = 1, ..., m,

\[
\begin{aligned}
a_{i1}(u_1 + w_1) &+ a_{i2}(u_2 + w_2) + \cdots + a_{in}(u_n + w_n) \\
&= a_{i1}u_1 + a_{i1}w_1 + a_{i2}u_2 + a_{i2}w_2 + \cdots + a_{in}u_n + a_{in}w_n \\
&= (a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n) + (a_{i1}w_1 + a_{i2}w_2 + \cdots + a_{in}w_n) \\
&= b_i + 0 = b_i
\end{aligned}
\]

That is, u + w is a solution of (*). Thus $u + w \in U$, and hence

\[ u + W \subset U \]

Now suppose $v = (v_1, \ldots, v_n)$ is any arbitrary element of U, i.e. solution of (*). Then, for
i = 1, ..., m,

\[ a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n = b_i \]

Observe that v = u + (v - u). We claim that $v - u \in W$. For i = 1, ..., m,

\[
\begin{aligned}
a_{i1}(v_1 - u_1) &+ a_{i2}(v_2 - u_2) + \cdots + a_{in}(v_n - u_n) \\
&= (a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n) - (a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n) \\
&= b_i - b_i = 0
\end{aligned}
\]

Thus v - u is a solution of the homogeneous system (**), i.e. $v - u \in W$. Then $v \in u + W$, and hence

\[ U \subset u + W \]

Both inclusion relations give U = u + W; that is, u + W is the general solution of the
nonhomogeneous system (*).

2.12. Consider the system (*) of linear equations (page 18). Multiplying the ith equation
by $c_i$, and adding, we obtain the equation

\[ (c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m \qquad (1) \]

Such an equation is termed a linear combination of the equations in (*). Show that
any solution of (*) is also a solution of the linear combination (1).

Suppose $u = (k_1, \ldots, k_n)$ is a solution of (*). Then

\[ a_{i1}k_1 + a_{i2}k_2 + \cdots + a_{in}k_n = b_i, \quad i = 1, \ldots, m \qquad (2) \]

To show that u is a solution of (1), we must verify the equation

\[ (c_1a_{11} + \cdots + c_ma_{m1})k_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})k_n = c_1b_1 + \cdots + c_mb_m \]

But this can be rearranged into

\[ c_1(a_{11}k_1 + \cdots + a_{1n}k_n) + \cdots + c_m(a_{m1}k_1 + \cdots + a_{mn}k_n) = c_1b_1 + \cdots + c_mb_m \]

or, by (2),

\[ c_1b_1 + \cdots + c_mb_m = c_1b_1 + \cdots + c_mb_m \]

which is clearly a true statement.

2.13. In the system (*) of linear equations, suppose $a_{11} \ne 0$. Let (#) be the system obtained
from (*) by the operation $L_i \to -a_{i1}L_1 + a_{11}L_i$, $i \ne 1$. Show that (*) and (#)
are equivalent systems, i.e. have the same solution set.

In view of the above operation on (*), each equation in (#) is a linear combination of equations
in (*); hence by the preceding problem any solution of (*) is also a solution of (#).

On the other hand, applying the operation $L_i \to \frac{1}{a_{11}}(a_{i1}L_1 + L_i)$ to (#), we obtain the original
system (*). That is, each equation in (*) is a linear combination of equations in (#); hence each
solution of (#) is also a solution of (*).

Both conditions show that (*) and (#) have the same solution set.


2.14. Prove Theorem 2.2: Consider a system in echelon form:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \cdots + a_{rn}x_n &= b_r
\end{aligned}
\]

where $1 < j_2 < \cdots < j_r$ and where $a_{11} \ne 0$, $a_{2j_2} \ne 0$, ..., $a_{rj_r} \ne 0$. The solution is as
follows. There are two cases:

(i) r = n. Then the system has a unique solution.

(ii) r < n. Then we can arbitrarily assign values to the n - r free variables and
obtain a solution of the system.

The proof is by induction on the number r of equations in the system. If r = 1, then we have
the single linear equation

\[ a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n = b, \quad\text{where } a_1 \ne 0 \]

The free variables are $x_2, \ldots, x_n$. Let us arbitrarily assign values to the free variables; say,
$x_2 = k_2$, $x_3 = k_3$, ..., $x_n = k_n$. Substituting into the equation and solving for $x_1$,

\[ x_1 = \frac{1}{a_1}(b - a_2k_2 - a_3k_3 - \cdots - a_nk_n) \]

These values constitute a solution of the equation; for, on substituting, we obtain

\[ a_1\left[\frac{1}{a_1}(b - a_2k_2 - \cdots - a_nk_n)\right] + a_2k_2 + \cdots + a_nk_n = b \quad\text{or}\quad b = b \]

which is a true statement.

Furthermore if r = n = 1, then we have ax = b, where a ≠ 0. Note that x = b/a is a solution
since a(b/a) = b is true. Moreover if x = k is a solution, i.e. ak = b, then k = b/a. Thus
the equation has a unique solution as claimed.

Now assume r > 1 and that the theorem is true for a system of r - 1 equations. We view the
r - 1 equations

\[
\begin{aligned}
a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \cdots + a_{rn}x_n &= b_r
\end{aligned}
\]

as a system in the unknowns $x_{j_2}, \ldots, x_n$. Note that the system is in echelon form. By induction
we can arbitrarily assign values to the $(n - j_2 + 1) - (r - 1)$ free variables in the reduced system
to obtain a solution (say, $x_{j_2} = k_{j_2}$, ..., $x_n = k_n$). As in case r = 1, these values and arbitrary
values for the additional $j_2 - 2$ free variables (say, $x_2 = k_2$, ..., $x_{j_2-1} = k_{j_2-1}$), yield a solution
of the first equation with

\[ x_1 = \frac{1}{a_{11}}(b_1 - a_{12}k_2 - \cdots - a_{1n}k_n) \]

(Note that there are $(n - j_2 + 1) - (r - 1) + (j_2 - 2) = n - r$ free variables.) Furthermore, these
values for $x_1, \ldots, x_n$ also satisfy the other equations since, in these equations, the coefficients of
$x_1, \ldots, x_{j_2-1}$ are zero.

Now if r = n, then $j_2 = 2$. Thus by induction we obtain a unique solution of the subsystem
and then a unique solution of the entire system. Accordingly, the theorem is proven.

2.15. A system (*) of linear equations is defined to be consistent if no linear combination
of its equations is the equation

\[ 0x_1 + 0x_2 + \cdots + 0x_n = b, \quad\text{where } b \ne 0 \qquad (1) \]

Show that the system (*) is consistent if and only if it is reducible to echelon form.

Suppose (*) is reducible to echelon form. Then it has a solution which, by Problem 2.12, is a
solution of every linear combination of its equations. Since (1) has no solution, it cannot be a linear
combination of the equations in (*). That is, (*) is consistent.

On the other hand, suppose (*) is not reducible to echelon form. Then, in the reduction process,
it must yield an equation of the form (1). That is, (1) is a linear combination of the equations in (*).
Accordingly (*) is not consistent, i.e. (*) is inconsistent.




Supplementary Problems 


SOLUTION OF LINEAR EQUATIONS 


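2.16. Solve:

\[
\text{(i)}\quad
\begin{aligned}
2x + 3y &= 1 \\
5x + 7y &= 3
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
2x + 4y &= 10 \\
3x + 6y &= 15
\end{aligned}
\qquad
\text{(iii)}\quad
\begin{aligned}
4x - 2y &= 5 \\
-6x + 3y &= 1
\end{aligned}
\]

2.17. Solve:

\[
\text{(i)}\quad
\begin{aligned}
2x + y - 3z &= 5 \\
3x - 2y + 2z &= 5 \\
5x - 3y - z &= 16
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
2x + 3y - 2z &= 5 \\
x - 2y + 3z &= 2 \\
4x - y + 4z &= 1
\end{aligned}
\qquad
\text{(iii)}\quad
\begin{aligned}
x + 2y + 3z &= 3 \\
2x + 3y + 8z &= 4 \\
3x + 2y + 17z &= 1
\end{aligned}
\]

2.18. Solve:

\[
\text{(i)}\quad
\begin{aligned}
2x + 3y &= 3 \\
x - 2y &= 5 \\
3x + 2y &= 7
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
x + 2y - 3z + 2w &= 2 \\
2x + 5y - 8z + 6w &= 5 \\
3x + 4y - 5z + 2w &= 4
\end{aligned}
\qquad
\text{(iii)}\quad
\begin{aligned}
x + 2y - z + 3w &= 3 \\
2x + 4y + 4z + 3w &= 9 \\
3x + 6y - z + 8w &= 10
\end{aligned}
\]

2.19. Solve:

\[
\text{(i)}\quad
\begin{aligned}
x + 2y + 2z &= 2 \\
3x - 2y - z &= 5 \\
2x - 5y + 3z &= -4 \\
x + 4y + 6z &= 0
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
x + 5y + 4z - 13w &= 3 \\
3x - y + 2z + 5w &= 2 \\
2x + 2y + 3z - 4w &= 1
\end{aligned}
\]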


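2.20. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution,
(ii) no solution, (iii) more than one solution:

\[
\text{(a)}\quad
\begin{aligned}
kx + y + z &= 1 \\
x + ky + z &= 1 \\
x + y + kz &= 1
\end{aligned}
\qquad
\text{(b)}\quad
\begin{aligned}
x + 2y + kz &= 1 \\
2x + ky + 8z &= 3
\end{aligned}
\]

2.21. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution,
(ii) no solution, (iii) more than one solution:

\[
\text{(a)}\quad
\begin{aligned}
x + y + kz &= 2 \\
3x + 4y + 2z &= k \\
2x + 3y - z &= 1
\end{aligned}
\qquad
\text{(b)}\quad
\begin{aligned}
x - 3z &= -3 \\
2x + ky - z &= -2 \\
x + 2y + kz &= 1
\end{aligned}
\]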


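2.22. Determine the condition on a, b and c so that the system in unknowns x, y and z has a solution:

\[
\text{(i)}\quad
\begin{aligned}
x + 2y - 3z &= a \\
3x - y + 2z &= b \\
x - 5y + 8z &= c
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
x - 2y + 4z &= a \\
2x + 3y - z &= b \\
3x + y + 2z &= c
\end{aligned}
\]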


HOMOGENEOUS SYSTEMS 


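2.23. Determine whether each system has a nonzero solution:

\[
\text{(i)}\quad
\begin{aligned}
x + 3y - 2z &= 0 \\
x - 8y + 8z &= 0 \\
3x - 2y + 4z &= 0
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
x + 3y - 2z &= 0 \\
2x - 3y + z &= 0 \\
3x - 2y + 2z &= 0
\end{aligned}
\qquad
\text{(iii)}\quad
\begin{aligned}
x + 2y - 5z + 4w &= 0 \\
2x - 3y + 2z + 3w &= 0 \\
4x - 7y + z - 6w &= 0
\end{aligned}
\]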


2.24. Determine whether each system has a nonzero solution:

\[
\text{(i)}\quad
\begin{aligned}
x - 2y + 2z &= 0 \\
2x + y - 2z &= 0 \\
3x + 4y - 6z &= 0 \\
3x - 11y + 12z &= 0
\end{aligned}
\qquad
\text{(ii)}\quad
\begin{aligned}
2x - 4y + 7z + 4v - 5w &= 0 \\
9x + 3y + 2z - 7v + w &= 0 \\
5x + 2y - 3z + v + 3w &= 0 \\
6x - 5y + 4z - 3v - 2w &= 0
\end{aligned}
\]

2.25. Determine whether the vectors u, v and w are dependent or independent (see Problem 2.9) where:


(1,3, —-1), v = (2,0,1), w = (3551) 


II 


(i) 4 


lI 


(ii) wu (inl 1), ar 525.1, 0) 55 = 1 (=1;-1) 2) 


(iii) u = (1, -2, 3, 1), v = (3, 2, 1, -2), w = (1, 6, -5, -4)


MISCELLANEOUS PROBLEMS 


2.26. Consider two general linear equations in two unknowns x and y over the real field R:

\[
\begin{aligned}
ax + by &= e \\
cx + dy &= f
\end{aligned}
\]

Show that:

(i) if $\frac{a}{c} \ne \frac{b}{d}$, i.e. if $ad - bc \ne 0$, then the system has the unique solution
$x = \frac{de - bf}{ad - bc}$, $y = \frac{af - ce}{ad - bc}$;

(ii) if $\frac{a}{c} = \frac{b}{d} \ne \frac{e}{f}$, then the system has no solution;

(iii) if $\frac{a}{c} = \frac{b}{d} = \frac{e}{f}$, then the system has more than one solution.

2.27. Consider the system

\[
\begin{aligned}
ax + by &= 1 \\
cx + dy &= 0
\end{aligned}
\]

Show that if ad - bc ≠ 0, then the system has the unique solution x = d/(ad - bc), y = -c/(ad - bc).
Also show that if ad - bc = 0, c ≠ 0 or d ≠ 0, then the system has no solution.

2.28. Show that an equation of the form $0x_1 + 0x_2 + \cdots + 0x_n = 0$ may be added or deleted from a
system without affecting the solution set.

2.29. Consider a system of linear equations with the same number of equations as unknowns:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}
\qquad (1)
\]

(i) Suppose the associated homogeneous system has only the zero solution. Show that (1) has a
unique solution for every choice of constants $b_i$.

(ii) Suppose the associated homogeneous system has a nonzero solution. Show that there are
constants $b_i$ for which (1) does not have a solution. Also show that if (1) has a solution, then
it has more than one.


Answers to Supplementary Problems

2.16. (i) x = 2, y = -1; (ii) x = 5 - 2a, y = a; (iii) no solution

2.17. (i) (1, -3, -2); (ii) no solution; (iii) (-1 - 7a, 2 + 2a, a) or x = -1 - 7z, y = 2 + 2z

2.18. (i) x = 3, y = -1
(ii) (-a + 2b, 1 + 2a - 2b, a, b) or x = -z + 2w, y = 1 + 2z - 2w
(iii) (7/2 - 5b/2 - 2a, a, 1/2 + b/2, b) or x = 7/2 - 5w/2 - 2y, z = 1/2 + w/2

2.19. (i) (2, 1, -1); (ii) no solution

2.20. (a) (i) k ≠ 1 and k ≠ -2; (ii) k = -2; (iii) k = 1
(b) (i) never has a unique solution; (ii) k = 4; (iii) k ≠ 4

2.21. (a) (i) k ≠ 3; (ii) always has a solution; (iii) k = 3
(b) (i) k ≠ 2 and k ≠ -5; (ii) k = -5; (iii) k = 2

2.22. (i) 2a - b + c = 0. (ii) Any values for a, b and c yield a solution.

2.23. (i) yes; (ii) no; (iii) yes, by Theorem 2.3.

2.24. (i) yes; (ii) yes, by Theorem 2.3.

2.25. (i) dependent; (ii) independent; (iii) dependent


Chapter 3 


Matrices 


INTRODUCTION 


In working with a system of linear equations, only the coefficients and their respective 
positions are important. Also, in reducing the system to echelon form, it is essential to 
keep the equations carefully aligned. Thus these coefficients can be efficiently arranged in 
a rectangular array called a “matrix”. Moreover, certain abstract objects introduced in 
later chapters, such as “change of basis”, “linear operator” and “bilinear form”, can also 
be represented by these rectangular arrays, i.e. matrices. 


In this chapter, we will study these matrices and certain algebraic operations defined on 
them. The material introduced here is mainly computational. However, as with linear 
equations, the abstract treatment presented later on will give us new insight into the 
structure of these matrices. 


Unless otherwise stated, all the ‘‘entries” in our matrices shall come from some arbitrary, 
but fixed, field K. (See Appendix B.) The elements of K are called scalars. Nothing essen- 
tial is lost if the reader assumes that K is the real field R or the complex field C. 


Lastly, we remark that the elements of $R^n$ or $C^n$ are conveniently represented by "row
vectors" or "column vectors", which are special cases of matrices.


MATRICES 
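Let K be an arbitrary field. A rectangular array of the form

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\]

where the $a_{ij}$ are scalars in K, is called a matrix over K, or simply a matrix if K is implicit.
The above matrix is also denoted by $(a_{ij})$, i = 1, ..., m, j = 1, ..., n, or simply by $(a_{ij})$.
The m horizontal n-tuples

\[ (a_{11}, a_{12}, \ldots, a_{1n}),\ (a_{21}, a_{22}, \ldots, a_{2n}),\ \ldots,\ (a_{m1}, a_{m2}, \ldots, a_{mn}) \]

are the rows of the matrix, and the n vertical m-tuples

\[
\begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix},\quad
\begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix},\quad \ldots,\quad
\begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}
\]

are its columns. Note that the element $a_{ij}$, called the ij-entry or ij-component, appears in
the ith row and the jth column. A matrix with m rows and n columns is called an m by n
matrix, or m × n matrix; the pair of numbers (m, n) is called its size or shape.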




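Example 3.1: The following is a 2 × 3 matrix:

\[ \begin{pmatrix} 1 & -3 & 4 \\ 0 & 5 & -2 \end{pmatrix} \]

Its rows are (1, -3, 4) and (0, 5, -2); its columns are
$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} -3 \\ 5 \end{pmatrix}$ and $\begin{pmatrix} 4 \\ -2 \end{pmatrix}$.
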
Matrices will usually be denoted by capital letters A,B,..., and the elements of the 
field K by lower case letters a,b,.... Two matrices A and B are equal, written A=B, if 
they have the same shape and if corresponding elements are equal. Thus the equality of 
two m Xn matrices is equivalent to a system of mn equalities, one for each pair of elements. 


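Example 3.2: The statement
$\begin{pmatrix} x + y & 2z + w \\ x - y & z - w \end{pmatrix} = \begin{pmatrix} 3 & 5 \\ 1 & 4 \end{pmatrix}$
is equivalent to the following system of equations:

\[
\begin{aligned}
x + y &= 3 \\
x - y &= 1 \\
2z + w &= 5 \\
z - w &= 4
\end{aligned}
\]

The solution of the system is x = 2, y = 1, z = 3, w = -1.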


Remark: A matrix with one row is also referred to as a row vector, and with one column 
as a column vector. In particular, an element in the field K can be viewed as 
a 1x1 matrix. 


MATRIX ADDITION AND SCALAR MULTIPLICATION 


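Let A and B be two matrices with the same size, i.e. the same number of rows and of
columns, say, m × n matrices:

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & & \vdots \\
b_{m1} & b_{m2} & \cdots & b_{mn}
\end{pmatrix}
\]

The sum of A and B, written A + B, is the matrix obtained by adding corresponding entries:

\[
A + B = \begin{pmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{pmatrix}
\]

The product of a scalar k by the matrix A, written k · A or simply kA, is the matrix obtained
by multiplying each entry of A by k:

\[
kA = \begin{pmatrix}
ka_{11} & ka_{12} & \cdots & ka_{1n} \\
ka_{21} & ka_{22} & \cdots & ka_{2n} \\
\vdots & \vdots & & \vdots \\
ka_{m1} & ka_{m2} & \cdots & ka_{mn}
\end{pmatrix}
\]

Observe that A + B and kA are also m × n matrices. We also define

\[ -A = -1 \cdot A \quad\text{and}\quad A - B = A + (-B) \]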


The sum of matrices with different sizes is not defined. 


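Example 3.3: Let $A = \begin{pmatrix} 1 & -2 & 3 \\ 4 & 5 & -6 \end{pmatrix}$ and $B = \begin{pmatrix} 3 & 0 & 2 \\ -7 & 1 & 8 \end{pmatrix}$. Then

\[
A + B = \begin{pmatrix} 1 + 3 & -2 + 0 & 3 + 2 \\ 4 - 7 & 5 + 1 & -6 + 8 \end{pmatrix}
= \begin{pmatrix} 4 & -2 & 5 \\ -3 & 6 & 2 \end{pmatrix}
\]

\[
3A = \begin{pmatrix} 3 \cdot 1 & 3 \cdot (-2) & 3 \cdot 3 \\ 3 \cdot 4 & 3 \cdot 5 & 3 \cdot (-6) \end{pmatrix}
= \begin{pmatrix} 3 & -6 & 9 \\ 12 & 15 & -18 \end{pmatrix}
\]

\[
2A - 3B = \begin{pmatrix} 2 & -4 & 6 \\ 8 & 10 & -12 \end{pmatrix} + \begin{pmatrix} -9 & 0 & -6 \\ 21 & -3 & -24 \end{pmatrix}
= \begin{pmatrix} -7 & -4 & 0 \\ 29 & 7 & -36 \end{pmatrix}
\]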
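
The entrywise definitions of the sum and the scalar product translate directly into code. The following is a minimal Python sketch, not part of the original text, with matrices stored as lists of rows. The function names `add` and `scale` are hypothetical, and the matrices are those of Example 3.3 as read above.

```python
def add(A, B):
    """Sum of two matrices of the same size, formed entry by entry."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("the sum of matrices with different sizes is not defined")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]


def scale(k, A):
    """Product of the scalar k by the matrix A."""
    return [[k * a for a in row] for row in A]


A = [[1, -2, 3], [4, 5, -6]]
B = [[3, 0, 2], [-7, 1, 8]]

total = add(A, B)                            # A + B
triple = scale(3, A)                         # 3A
combination = add(scale(2, A), scale(-3, B)) # 2A - 3B, written as 2A + (-3)B
```

Here 2A - 3B is computed as 2A + (-3)B, which mirrors the definition A - B = A + (-B).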
Example 3.4: The m × n matrix whose entries are all zero,

\[
\begin{pmatrix}
0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix}
\]

is called the zero matrix and will be denoted by 0. It is similar to the scalar 0 in
that, for any m × n matrix $A = (a_{ij})$, $A + 0 = (a_{ij} + 0) = (a_{ij}) = A$.

Basic properties of matrices under the operations of matrix addition and scalar
multiplication follow.


Theorem 3.1: Let V be the set of all m × n matrices over a field K. Then for any matrices
$A, B, C \in V$ and any scalars $k_1, k_2 \in K$,

(i) (A + B) + C = A + (B + C)        (v) $k_1(A + B) = k_1A + k_1B$
(ii) A + 0 = A                        (vi) $(k_1 + k_2)A = k_1A + k_2A$
(iii) A + (-A) = 0                    (vii) $(k_1k_2)A = k_1(k_2A)$
(iv) A + B = B + A                    (viii) 1 · A = A and 0A = 0

Using (vi) and (viii) above, we also have that A + A = 2A, A + A + A = 3A, ....

Remark: Suppose vectors in $R^n$ are represented by row vectors (or by column vectors);
say,

\[ u = (a_1, a_2, \ldots, a_n) \quad\text{and}\quad v = (b_1, b_2, \ldots, b_n) \]

Then viewed as matrices, the sum u + v and the scalar product ku are as follows:

\[ u + v = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n) \quad\text{and}\quad ku = (ka_1, ka_2, \ldots, ka_n) \]

But this corresponds precisely to the sum and scalar product as defined in
Chapter 1. In other words, the above operations on matrices may be viewed
as a generalization of the corresponding operations defined in Chapter 1.


MATRIX MULTIPLICATION 
The product of matrices A and B, written AB, is somewhat complicated. For this 
reason, we include the following introductory remarks. 


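(i) Let $A = (a_i)$ and $B = (b_i)$ belong to $R^n$, and A represented by a row vector and B by a
column vector. Then their dot product A · B may be found by combining the matrices
as follows:

\[
A \cdot B = (a_1, a_2, \ldots, a_n)\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}
= a_1b_1 + a_2b_2 + \cdots + a_nb_n
\]

Accordingly, we define the matrix product of a row vector A by a column vector B as
above.

(ii) Consider the equations

\[
\begin{aligned}
b_{11}x_1 + b_{12}x_2 + b_{13}x_3 &= y_1 \\
b_{21}x_1 + b_{22}x_2 + b_{23}x_3 &= y_2
\end{aligned}
\qquad (1)
\]

This system is equivalent to the matrix equation

\[
\begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
\quad\text{or simply}\quad BX = Y
\]

where $B = (b_{ij})$, $X = (x_i)$ and $Y = (y_i)$, if we combine the matrix B and the column
vector X as follows:

\[
BX = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} b_{11}x_1 + b_{12}x_2 + b_{13}x_3 \\ b_{21}x_1 + b_{22}x_2 + b_{23}x_3 \end{pmatrix}
= \begin{pmatrix} B_1 \cdot X \\ B_2 \cdot X \end{pmatrix}
\]

where $B_1$ and $B_2$ are the rows of B. Note that the product of a matrix and a column
vector yields another column vector.

(iii) Now consider the equations

\[
\begin{aligned}
a_{11}y_1 + a_{12}y_2 &= z_1 \\
a_{21}y_1 + a_{22}y_2 &= z_2
\end{aligned}
\qquad (2)
\]

which we can represent, as above, by the matrix equation

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
\quad\text{or simply}\quad AY = Z
\]

where $A = (a_{ij})$, $Y = (y_i)$ as above, and $Z = (z_i)$. Substituting the values of $y_1$ and $y_2$
of (1) into the equations of (2), we obtain

\[
\begin{aligned}
a_{11}(b_{11}x_1 + b_{12}x_2 + b_{13}x_3) + a_{12}(b_{21}x_1 + b_{22}x_2 + b_{23}x_3) &= z_1 \\
a_{21}(b_{11}x_1 + b_{12}x_2 + b_{13}x_3) + a_{22}(b_{21}x_1 + b_{22}x_2 + b_{23}x_3) &= z_2
\end{aligned}
\]

or, on rearranging terms,

\[
\begin{aligned}
(a_{11}b_{11} + a_{12}b_{21})x_1 + (a_{11}b_{12} + a_{12}b_{22})x_2 + (a_{11}b_{13} + a_{12}b_{23})x_3 &= z_1 \\
(a_{21}b_{11} + a_{22}b_{21})x_1 + (a_{21}b_{12} + a_{22}b_{22})x_2 + (a_{21}b_{13} + a_{22}b_{23})x_3 &= z_2
\end{aligned}
\qquad (3)
\]

On the other hand, using the matrix equation BX = Y and substituting for Y into
AY = Z, we obtain the expression

\[ ABX = Z \]

This will represent the system (3) if we define the product of A and B as follows:

\[
\begin{aligned}
AB &= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}
= \begin{pmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23}
\end{pmatrix} \\
&= \begin{pmatrix}
A_1 \cdot B^1 & A_1 \cdot B^2 & A_1 \cdot B^3 \\
A_2 \cdot B^1 & A_2 \cdot B^2 & A_2 \cdot B^3
\end{pmatrix}
\end{aligned}
\]

where $A_1$ and $A_2$ are the rows of A and $B^1$, $B^2$ and $B^3$ are the columns of B. We emphasize
that if these computations are done in general, then the main requirement is
that the number of $y_i$ in (1) and (2) must be the same. This will then correspond to the
fact that the number of columns of the matrix A must equal the number of rows of
the matrix B.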


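With the above introduction, we now formally define matrix multiplication.

Definition: Suppose $A = (a_{ij})$ and $B = (b_{ij})$ are matrices such that the number of columns
of A is equal to the number of rows of B; say, A is an m × p matrix and B is a
p × n matrix. Then the product AB is the m × n matrix whose ij-entry is
obtained by multiplying the ith row $A_i$ of A by the jth column $B^j$ of B:

\[
AB = \begin{pmatrix}
A_1 \cdot B^1 & A_1 \cdot B^2 & \cdots & A_1 \cdot B^n \\
A_2 \cdot B^1 & A_2 \cdot B^2 & \cdots & A_2 \cdot B^n \\
\vdots & \vdots & & \vdots \\
A_m \cdot B^1 & A_m \cdot B^2 & \cdots & A_m \cdot B^n
\end{pmatrix}
\]

That is,

\[
\begin{pmatrix}
a_{11} & \cdots & a_{1p} \\
\vdots & & \vdots \\
a_{i1} & \cdots & a_{ip} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mp}
\end{pmatrix}
\begin{pmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\
\vdots & & \vdots & & \vdots \\
b_{p1} & \cdots & b_{pj} & \cdots & b_{pn}
\end{pmatrix}
= \begin{pmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & c_{ij} & \vdots \\
c_{m1} & \cdots & c_{mn}
\end{pmatrix}
\]

where $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ip}b_{pj} = \sum_{k=1}^{p} a_{ik}b_{kj}$.

We emphasize that the product AB is not defined if A is an m × p matrix and B is a
q × n matrix, where p ≠ q.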


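Example 3.5:
\[
\begin{pmatrix} r & s \\ t & u \end{pmatrix}
\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{pmatrix}
= \begin{pmatrix} ra_1 + sb_1 & ra_2 + sb_2 & ra_3 + sb_3 \\ ta_1 + ub_1 & ta_2 + ub_2 & ta_3 + ub_3 \end{pmatrix}
\]

Example 3.6:
\[
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 1 + 2 \cdot 0 & 1 \cdot 1 + 2 \cdot 2 \\ 3 \cdot 1 + 4 \cdot 0 & 3 \cdot 1 + 4 \cdot 2 \end{pmatrix}
= \begin{pmatrix} 1 & 5 \\ 3 & 11 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 1 + 1 \cdot 3 & 1 \cdot 2 + 1 \cdot 4 \\ 0 \cdot 1 + 2 \cdot 3 & 0 \cdot 2 + 2 \cdot 4 \end{pmatrix}
= \begin{pmatrix} 4 & 6 \\ 6 & 8 \end{pmatrix}
\]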


The above example shows that matrix multiplication is not commutative, i.e. the products 
AB and BA of matrices need not be equal. 
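
The definition of the product, with ij-entry equal to the sum of $a_{ik}b_{kj}$ over k, can be written out directly. The following is a minimal Python sketch, not part of the original text. The function name `matmul` is hypothetical, matrices are lists of rows, and the two matrices are those of Example 3.6 as read above.

```python
def matmul(A, B):
    """Product AB, defined only when A is m x p and B is p x n."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    n = len(B[0])
    # The ij-entry is the ith row of A times the jth column of B.
    return [
        [sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
        for i in range(len(A))
    ]


A = [[1, 2], [3, 4]]
B = [[1, 1], [0, 2]]

AB = matmul(A, B)
BA = matmul(B, A)
commute = AB == BA
```

For these two matrices `commute` is `False`, which is the point of the example.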


Matrix multiplication does, however, satisfy the following properties: 


Theorem 3.2: (i) (AB)C = A(BC), (associative law)
(ii) A(B + C) = AB + AC, (left distributive law)
(iii) (B + C)A = BA + CA, (right distributive law)
(iv) k(AB) = (kA)B = A(kB), where k is a scalar
We assume that the sums and products in the above theorem are defined.
We remark that 0A = 0 and B0 = 0 where 0 is the zero matrix.


TRANSPOSE 
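The transpose of a matrix A, written $A^t$, is the matrix obtained by writing the rows of
A, in order, as columns:

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}^t
= \begin{pmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{pmatrix}
\]

Observe that if A is an m × n matrix, then $A^t$ is an n × m matrix.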




1 2 BALE i 5 
Example 3.7: e an ve = psa Ned} 
B = ¢ 


The transpose operation on matrices satisfies the following properties: 
Theorem 3.3: (i) $(A + B)^t = A^t + B^t$
(ii) $(A^t)^t = A$
(iii) $(kA)^t = kA^t$, for k a scalar
(iv) $(AB)^t = B^tA^t$
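
Property (iv) reverses the order of the factors, which is easy to get wrong. The following minimal Python sketch, not part of the original text, checks it on one pair of matrices. The names `transpose` and `matmul` and the two sample matrices are hypothetical choices made for illustration, and a single numerical check is of course not a proof.

```python
def transpose(A):
    """Write the rows of A, in order, as columns."""
    return [list(column) for column in zip(*A)]


def matmul(A, B):
    """Product of an m x p matrix by a p x n matrix (lists of rows)."""
    return [
        [sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
        for row in A
    ]


A = [[1, 0, 2], [-1, 3, 1]]    # a 2 x 3 matrix
B = [[3, 1], [2, 1], [1, 0]]   # a 3 x 2 matrix

left = transpose(matmul(A, B))               # (AB)^t
right = matmul(transpose(B), transpose(A))   # B^t A^t
```

Note that $A^tB^t$ would be a 3 × 3 matrix here, so it could not equal the 2 × 2 matrix $(AB)^t$.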


MATRICES AND SYSTEMS OF LINEAR EQUATIONS 


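The following system of linear equations

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\qquad (1)
\]

is equivalent to the matrix equation

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}
\quad\text{or simply}\quad AX = B \qquad (2)
\]

where $A = (a_{ij})$, $X = (x_i)$ and $B = (b_i)$. That is, every solution of the system (1) is a
solution of the matrix equation (2), and vice versa. Observe that the associated homogeneous
system of (1) is then equivalent to the matrix equation AX = 0.

The above matrix A is called the coefficient matrix of the system (1), and the matrix

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{pmatrix}
\]

is called the augmented matrix of (1). Observe that the system (1) is completely determined
by its augmented matrix.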


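Example 3.8: The coefficient matrix and the augmented matrix of the system

\[
\begin{aligned}
2x + 3y - 4z &= 7 \\
x - 2y - 5z &= 3
\end{aligned}
\]

are respectively the following matrices:

\[
\begin{pmatrix} 2 & 3 & -4 \\ 1 & -2 & -5 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 2 & 3 & -4 & 7 \\ 1 & -2 & -5 & 3 \end{pmatrix}
\]

Observe that the system is equivalent to the matrix equation

\[
\begin{pmatrix} 2 & 3 & -4 \\ 1 & -2 & -5 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 7 \\ 3 \end{pmatrix}
\]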


In studying linear equations it is usually simpler to use the language and theory of 
matrices, as indicated by the following theorems. 


Theorem 3.4: Suppose $u_1, u_2, \ldots, u_n$ are solutions of a homogeneous system of linear
equations AX = 0. Then every linear combination of the $u_i$ of the form
$k_1u_1 + k_2u_2 + \cdots + k_nu_n$ where the $k_i$ are scalars, is also a solution of
AX = 0. Thus, in particular, every multiple ku of any solution u of
AX = 0 is also a solution of AX = 0.

Proof. We are given that $Au_1 = 0$, $Au_2 = 0$, ..., $Au_n = 0$. Hence

\[
\begin{aligned}
A(k_1u_1 + k_2u_2 + \cdots + k_nu_n) &= k_1Au_1 + k_2Au_2 + \cdots + k_nAu_n \\
&= k_10 + k_20 + \cdots + k_n0 = 0
\end{aligned}
\]

Accordingly, $k_1u_1 + \cdots + k_nu_n$ is a solution of the homogeneous system AX = 0.


Theorem 3.5: Suppose the field K is infinite (e.g. if K is the real field R or the complex 
field C). Then the system AX =B has no solution, a unique solution or 
an infinite number of solutions. 


Proof. It suffices to show that if AX = B has more than one solution, then it has
infinitely many. Suppose u and v are distinct solutions of AX = B; that is, Au = B and
Av = B. Then, for any $k \in K$,

\[ A(u + k(u - v)) = Au + k(Au - Av) = B + k(B - B) = B \]

In other words, for each $k \in K$, u + k(u - v) is a solution of AX = B. Since all such
solutions are distinct (Problem 3.31), AX = B has an infinite number of solutions as
claimed.
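
The construction in the proof, which produces a new solution u + k(u - v) for every scalar k, can be tried on a small system. The following is a minimal Python sketch, not part of the original text. The matrix `A`, the column `B`, the two starting solutions and the helper name `apply` are all hypothetical examples chosen for illustration.

```python
def apply(A, x):
    """Compute the column AX for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# A hypothetical system AX = B with two known distinct solutions u and v.
A = [[1, 2, -1], [2, 4, -2]]
B = [3, 6]
u = [3, 0, 0]
v = [1, 1, 0]

# One new solution u + k(u - v) for each scalar k tried.
family = [[ui + k * (ui - vi) for ui, vi in zip(u, v)] for k in range(5)]
all_solve = all(apply(A, x) == B for x in family)
all_distinct = len({tuple(x) for x in family}) == len(family)
```

Both flags come out `True` for this example: each member of `family` solves the system, and different values of k give different solutions.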


ECHELON MATRICES 

A matrix $A = (a_{ij})$ is an echelon matrix, or is said to be in echelon form, if the number
of zeros preceding the first nonzero entry of a row increases row by row until only zero
rows remain; that is, if there exist nonzero entries

\[ a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}, \quad\text{where } j_1 < j_2 < \cdots < j_r \]

with the property that

\[ a_{ij} = 0 \quad\text{for } i \le r,\ j < j_i, \text{ and for } i > r \]

We call $a_{1j_1}, \ldots, a_{rj_r}$ the distinguished elements of the echelon matrix A.
Example 3.9: The following are echelon matrices where the distinguished elements have been 
circled: 
ya3 20 4b 6 GQrenrs eG) s< woe cog ayo 
o707@ 1-3 2 0 0 0 @ 6 00) G) 0 -3>-0 
6 6107670 © 2 0 0 6 ee (ma 
6 8b 6 0 0 ooo 0 0 0 0 0 ee © 


In particular, an echelon matrix is called a row reduced echelon matrix if the
distinguished elements are:

(i) the only nonzero entries in their respective columns,

(ii) each equal to 1.

The third matrix above is an example of a row reduced echelon matrix, the other two are
not. Note that the zero matrix 0, for any number of rows or of columns, is also a row
reduced echelon matrix.


ROW EQUIVALENCE AND ELEMENTARY ROW OPERATIONS 


A matrix A is said to be row equivalent to a matrix B if B can be obtained from A by a 
finite sequence of the following operations called elementary row operations: 


$[E_1]$: Interchange the ith row and the jth row: $R_i \leftrightarrow R_j$.
$[E_2]$: Multiply the ith row by a nonzero scalar k: $R_i \to kR_i$, k ≠ 0.
$[E_3]$: Replace the ith row by k times the jth row plus the ith row: $R_i \to kR_j + R_i$.

In actual practice we apply $[E_2]$ and then $[E_3]$ in one step, i.e. the operation
$[E]$: Replace the ith row by k' times the jth row plus k (nonzero) times the ith row:
$R_i \to k'R_j + kR_i$, k ≠ 0.


The reader no doubt recognizes the similarity of the above operations and those used 
in solving systems of linear equations. In fact, two systems with row equivalent aug- 
mented matrices have the same solution set (Problem 3.71). The following algorithm is 
also similar to the one used with linear equations (page 20). 


Algorithm which row reduces a matrix to echelon form: 


Step 1. Suppose the $j_1$ column is the first column with a nonzero entry. Interchange
the rows so that this nonzero entry appears in the first row, that is,
so that $a_{1j_1} \ne 0$.

Step 2. For each i > 1, apply the operation

\[ R_i \to -a_{ij_1}R_1 + a_{1j_1}R_i \]


Repeat Steps 1 and 2 with the submatrix formed by all the rows excluding the first. 
Continue the process until the matrix is in echelon form. 


Remark: The term row reduce shall mean to transform by elementary row operations. 


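Example 3.10: The following matrix A is row reduced to echelon form by applying the operations
$R_2 \to -2R_1 + R_2$ and $R_3 \to -3R_1 + R_3$, and then the operation $R_3 \to -5R_2 + 4R_3$:

\[
A = \begin{pmatrix} 1 & 2 & -3 & 0 \\ 2 & 4 & -2 & 2 \\ 3 & 6 & -4 & 3 \end{pmatrix}
\quad\text{to}\quad
\begin{pmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 5 & 3 \end{pmatrix}
\quad\text{to}\quad
\begin{pmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 0 & 2 \end{pmatrix}
\]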
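
Steps 1 and 2 of the algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the original text. The function name `row_reduce_to_echelon` is hypothetical, the matrix is a list of rows with integer entries, and rows that already have a zero in the pivot column are left untouched rather than rescaled.

```python
def row_reduce_to_echelon(matrix):
    """Row reduce to echelon form using R_i -> -a_ij1 R_1 + a_1j1 R_i (no division)."""
    M = [list(row) for row in matrix]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    top = 0  # index of the first row of the current submatrix
    for col in range(n_cols):
        if top >= n_rows:
            break
        # Step 1: find a nonzero entry in this column and move it to the top row.
        pivot = next((i for i in range(top, n_rows) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        # Step 2: eliminate the entries below the pivot.
        p = M[top][col]
        for i in range(top + 1, n_rows):
            q = M[i][col]
            if q != 0:
                M[i] = [-q * a + p * b for a, b in zip(M[top], M[i])]
        # Repeat with the submatrix formed by the remaining rows.
        top += 1
    return M


A = [[1, 2, -3, 0], [2, 4, -2, 2], [3, 6, -4, 3]]
echelon = row_reduce_to_echelon(A)
```

Applied to the matrix A of Example 3.10 as read above, the sketch performs the same three row operations and ends with the rows (1, 2, -3, 0), (0, 0, 4, 2) and (0, 0, 0, 2). Avoiding division keeps integer matrices in integers, at the cost of entries that may grow.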


Now suppose $A = (a_{ij})$ is a matrix in echelon form with distinguished elements
$a_{1j_1}, \ldots, a_{rj_r}$. Apply the operations

\[ R_k \to -a_{kj_i}R_i + a_{ij_i}R_k, \quad k = 1, \ldots, i - 1 \]

for i = 2, then i = 3, ..., i = r. Thus A is replaced by an echelon matrix whose
distinguished elements are the only nonzero entries in their respective columns. Next, multiply
$R_i$ by $a_{ij_i}^{-1}$, $i \le r$. Thus, in addition, the distinguished elements are each 1. In other words,
the above process row reduces an echelon matrix to one in row reduced echelon form.


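Example 3.11: On the following echelon matrix A, apply the operation $R_1 \to -4R_2 + 3R_1$ and then
the operations $R_1 \to R_3 + R_1$ and $R_2 \to -5R_3 + 2R_2$:

\[
A = \begin{pmatrix} 2 & 3 & 4 & 5 & 6 \\ 0 & 0 & 3 & 2 & 5 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}
\quad\text{to}\quad
\begin{pmatrix} 6 & 9 & 0 & 7 & -2 \\ 0 & 0 & 3 & 2 & 5 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}
\quad\text{to}\quad
\begin{pmatrix} 6 & 9 & 0 & 7 & 0 \\ 0 & 0 & 6 & 4 & 0 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}
\]

Next multiply $R_1$ by 1/6, $R_2$ by 1/6 and $R_3$ by 1/2 to obtain the row reduced echelon
matrix

\[
\begin{pmatrix} 1 & 3/2 & 0 & 7/6 & 0 \\ 0 & 0 & 1 & 2/3 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}
\]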


The above remarks show that any arbitrary matrix A is row equivalent to at least one 
row reduced echelon matrix. In the next chapter we prove, Theorem 4.8, that A is row 
equivalent to only one such matrix; we call it the row canonical form of A. 
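
A minimal Python sketch of reduction to row canonical form is given below. It is not the text's procedure verbatim: as an assumption made for brevity, it normalizes each distinguished element to 1 as soon as it is found and clears its column at once (Gauss-Jordan order). Since the row canonical form is unique, the result is the same. The name `row_canonical` is hypothetical, and exact arithmetic is done with `fractions.Fraction`.

```python
from fractions import Fraction


def row_canonical(matrix):
    """Return the row canonical (row reduced echelon) form of a matrix."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    top = 0
    for col in range(n_cols):
        pivot = next((r for r in range(top, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[top], rows[pivot] = rows[pivot], rows[top]
        # Make the distinguished element equal to 1.
        p = rows[top][col]
        rows[top] = [x / p for x in rows[top]]
        # Make it the only nonzero entry in its column.
        for i in range(n_rows):
            if i != top and rows[i][col] != 0:
                a = rows[i][col]
                rows[i] = [y - a * x for x, y in zip(rows[top], rows[i])]
        top += 1
        if top == n_rows:
            break
    return rows


canonical = row_canonical([[2, 3, 4, 5, 6], [0, 0, 3, 2, 5], [0, 0, 0, 0, 2]])
```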


CHAP, 3] MATRICES 43 


SQUARE MATRICES 


A matrix with the same number of rows as columns is called a square matrix. A square
matrix with n rows and n columns is said to be of order n, and is called an n-square matrix.
The diagonal (or: main diagonal) of the n-square matrix $A = (a_{ij})$ consists of the elements
$a_{11}, a_{22}, \ldots, a_{nn}$.


Example 3.12: The following is a 3-square matrix:


$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$


Its diagonal elements are 1, 5 and 9.


An upper triangular matrix or simply a triangular matrix is a square matrix whose
entries below the main diagonal are all zero:


$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}$$


Similarly, a lower triangular matrix is a square matrix whose entries above the main 
diagonal are all zero. 


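A diagonal matrix is a square matrix whose non-diagonal entries are all zero:


$$\begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}$$


In particular, the n-square matrix with 1's on the diagonal and 0's elsewhere, denoted by $I_n$
or simply I, is called the unit or identity matrix; e.g.,


$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$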


This matrix I is similar to the scalar 1 in that, for any n-square matrix A,
$$AI = IA = A$$


The matrix kI, for a scalar $k \in K$, is called a scalar matrix; it is a diagonal matrix whose
diagonal entries are each k.


ALGEBRA OF SQUARE MATRICES 


Recall that not every two matrices can be added or multiplied. However, if we only
consider square matrices of some given order n, then this inconvenience disappears. Specifically,
the operations of addition, multiplication, scalar multiplication, and transpose can be
performed on any $n \times n$ matrices and the result is again an $n \times n$ matrix.


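In particular, if A is any n-square matrix, we can form powers of A:
$$A^2 = AA, \quad A^3 = A^2 A, \ \ldots \quad \text{and} \quad A^0 = I$$
We can also form polynomials in the matrix A: for any polynomial


$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$


where the $a_i$ are scalars, we define f(A) to be the matrix
$$f(A) = a_0 I + a_1 A + a_2 A^2 + \cdots + a_n A^n$$


In the case that f(A) is the zero matrix, then A is called a zero or root of the polynomial f(x).


Example 3.13: Let $A = \begin{pmatrix} 1 & 2 \\ 3 & -4 \end{pmatrix}$; then $A^2 = \begin{pmatrix} 1 & 2 \\ 3 & -4 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & -4 \end{pmatrix} = \begin{pmatrix} 7 & -6 \\ -9 & 22 \end{pmatrix}$.


If $f(x) = 2x^2 - 3x + 5$, then
$$f(A) = 2\begin{pmatrix} 7 & -6 \\ -9 & 22 \end{pmatrix} - 3\begin{pmatrix} 1 & 2 \\ 3 & -4 \end{pmatrix} + 5\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 16 & -18 \\ -27 & 61 \end{pmatrix}$$
If $g(x) = x^2 + 3x - 10$, then
$$g(A) = \begin{pmatrix} 7 & -6 \\ -9 & 22 \end{pmatrix} + 3\begin{pmatrix} 1 & 2 \\ 3 & -4 \end{pmatrix} - 10\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$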

Thus A is a zero of the polynomial g(x). 
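
The evaluation of a polynomial at a square matrix can be sketched in Python as below. The helper names `mat_mul` and `poly_of_matrix` are hypothetical, and the coefficient list is assumed to run from the constant term upward, so that $f(x) = 2x^2 - 3x + 5$ is written `[5, -3, 2]`.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def poly_of_matrix(coeffs, a):
    """Evaluate f(A) = a0*I + a1*A + ... + an*A^n for coeffs = [a0, a1, ..., an]."""
    n = len(a)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # A^0 = I
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        # Add c * (current power of A), then step to the next power.
        result = [[r + c * p for r, p in zip(rr, pr)] for rr, pr in zip(result, power)]
        power = mat_mul(power, a)
    return result


A = [[1, 2], [3, -4]]
f_of_A = poly_of_matrix([5, -3, 2], A)     # f(x) = 2x^2 - 3x + 5
g_of_A = poly_of_matrix([-10, 3, 1], A)    # g(x) = x^2 + 3x - 10
```

Here `g_of_A` is the zero matrix, which is what it means for A to be a zero of g(x).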


INVERTIBLE MATRICES 
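A square matrix A is said to be invertible if there exists a matrix B with the property
that
$$AB = BA = I$$
where I is the identity matrix. Such a matrix B is unique; for
$$AB_1 = B_1 A = I \text{ and } AB_2 = B_2 A = I \text{ implies } B_1 = B_1 I = B_1(AB_2) = (B_1 A)B_2 = IB_2 = B_2$$


We call such a matrix B the inverse of A and denote it by $A^{-1}$. Observe that the above
relation is symmetric; that is, if B is the inverse of A, then A is the inverse of B.


Example 3.14:
$$\begin{pmatrix} 2 & 5 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 6-5 & -10+10 \\ 3-3 & -5+6 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
$$\begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix}\begin{pmatrix} 2 & 5 \\ 1 & 3 \end{pmatrix} = \begin{pmatrix} 6-5 & 15-15 \\ -2+2 & -5+6 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
Thus $\begin{pmatrix} 2 & 5 \\ 1 & 3 \end{pmatrix}$ and $\begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix}$ are invertible and are inverses of each other.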


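We show (Problem 3.37) that for square matrices, AB = I if and only if BA = I; hence
it is necessary to test only one product to determine whether two given matrices are
inverses, as in the next example.


Example 3.15:
$$\begin{pmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{pmatrix}\begin{pmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{pmatrix} = \begin{pmatrix} -11+0+12 & 2+0-2 & 2+0-2 \\ -22+4+18 & 4+0-3 & 4-1-3 \\ -44-4+48 & 8+0-8 & 8+1-8 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$


Thus the two matrices are invertible and are inverses of each other.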


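We now calculate the inverse of a general $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. We seek scalars
x, y, z, w such that


$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x & y \\ z & w \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} ax+bz & ay+bw \\ cx+dz & cy+dw \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$


which reduces to solving the following two systems of linear equations in two unknowns:


$$\begin{cases} ax + bz = 1 \\ cx + dz = 0 \end{cases} \qquad \begin{cases} ay + bw = 0 \\ cy + dw = 1 \end{cases}$$


If we let $|A| = ad - bc$, then by Problem 2.27, page 33, the above systems have solutions if
and only if $|A| \neq 0$; such solutions are unique and are as follows:


$$x = \frac{d}{ad-bc} = \frac{d}{|A|}, \quad y = \frac{-b}{ad-bc} = \frac{-b}{|A|}, \quad z = \frac{-c}{ad-bc} = \frac{-c}{|A|}, \quad w = \frac{a}{ad-bc} = \frac{a}{|A|}$$
Accordingly,
$$A^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} d/|A| & -b/|A| \\ -c/|A| & a/|A| \end{pmatrix} = \frac{1}{|A|}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$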


Remark: The reader no doubt recognizes $|A| = ad - bc$ as the determinant of the matrix
A; thus we see that a $2 \times 2$ matrix has an inverse if and only if its determinant
is not zero. This relationship, which holds true in general, will be further
investigated in Chapter 9 on determinants.
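
The $2 \times 2$ inverse formula can be sketched in Python as follows. The function name `inverse_2x2` is hypothetical, and integer entries are assumed so that `fractions.Fraction` can represent the quotients exactly; the function returns `None` when $|A| = 0$.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of [[a, b], [c, d]] by the formula (1/|A|) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c          # |A| = ad - bc
    if det == 0:
        return None              # no inverse exists
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


inv = inverse_2x2([[2, 5], [1, 3]])
```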


BLOCK MATRICES 


Using a system of horizontal and vertical lines, we can partition a matrix A into smaller 
matrices called blocks (or: cells) of A. The matrix A is then called a block matrix. Clearly, 
a given matrix may be divided into blocks in different ways; for example, 


[Display not recoverable: a single matrix shown partitioned into blocks in different ways.]

The convenience of the partition into blocks is that the result of operations on block matrices 
can be obtained by carrying out the computation with the blocks, just as if they were the 
actual elements of the matrices. This is illustrated below. 


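Suppose A is partitioned into blocks; say


$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}$$


Multiplying each block by a scalar k multiplies each element of A by k; thus


$$kA = \begin{pmatrix} kA_{11} & kA_{12} & \cdots & kA_{1n} \\ kA_{21} & kA_{22} & \cdots & kA_{2n} \\ \vdots & \vdots & & \vdots \\ kA_{m1} & kA_{m2} & \cdots & kA_{mn} \end{pmatrix}$$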


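Now suppose a matrix B is partitioned into the same number of blocks as A; say


$$B = \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1n} \\ B_{21} & B_{22} & \cdots & B_{2n} \\ \vdots & \vdots & & \vdots \\ B_{m1} & B_{m2} & \cdots & B_{mn} \end{pmatrix}$$


Furthermore, suppose the corresponding blocks of A and B have the same size. Adding
these corresponding blocks adds the corresponding elements of A and B. Accordingly,


$$A + B = \begin{pmatrix} A_{11}+B_{11} & A_{12}+B_{12} & \cdots & A_{1n}+B_{1n} \\ A_{21}+B_{21} & A_{22}+B_{22} & \cdots & A_{2n}+B_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1}+B_{m1} & A_{m2}+B_{m2} & \cdots & A_{mn}+B_{mn} \end{pmatrix}$$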


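The case of matrix multiplication is less obvious but still true. That is, suppose matrices
U and V are partitioned into blocks as follows


$$U = \begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1p} \\ U_{21} & U_{22} & \cdots & U_{2p} \\ \vdots & \vdots & & \vdots \\ U_{m1} & U_{m2} & \cdots & U_{mp} \end{pmatrix} \quad \text{and} \quad V = \begin{pmatrix} V_{11} & V_{12} & \cdots & V_{1n} \\ V_{21} & V_{22} & \cdots & V_{2n} \\ \vdots & \vdots & & \vdots \\ V_{p1} & V_{p2} & \cdots & V_{pn} \end{pmatrix}$$


such that the number of columns of each block $U_{ik}$ is equal to the number of rows of each
block $V_{kj}$. Then


$$UV = \begin{pmatrix} W_{11} & W_{12} & \cdots & W_{1n} \\ W_{21} & W_{22} & \cdots & W_{2n} \\ \vdots & \vdots & & \vdots \\ W_{m1} & W_{m2} & \cdots & W_{mn} \end{pmatrix}$$
where $W_{ij} = U_{i1}V_{1j} + U_{i2}V_{2j} + \cdots + U_{ip}V_{pj}$.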


The proof of the above formula for UV is straightforward, but detailed and lengthy. It 
is left as a supplementary problem (Problem 3.68). 
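
As a numerical illustration (not a proof), the block formula for UV can be checked in Python on a small case. The sketch below assumes each matrix is split into a 2 by 2 arrangement of nonempty blocks; the helper names `split` and `block_mul` and the sample matrices are hypothetical.

```python
def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m, r, c):
    """Partition m into four blocks: rows split after row r, columns after column c."""
    return ([row[:c] for row in m[:r]], [row[c:] for row in m[:r]],
            [row[:c] for row in m[r:]], [row[c:] for row in m[r:]])


def block_mul(u, v, r, k, c):
    """Compute UV blockwise: W_ij = U_i1 V_1j + U_i2 V_2j."""
    u11, u12, u21, u22 = split(u, r, k)   # column split of U ...
    v11, v12, v21, v22 = split(v, k, c)   # ... must match row split of V
    w11 = mat_add(mat_mul(u11, v11), mat_mul(u12, v21))
    w12 = mat_add(mat_mul(u11, v12), mat_mul(u12, v22))
    w21 = mat_add(mat_mul(u21, v11), mat_mul(u22, v21))
    w22 = mat_add(mat_mul(u21, v12), mat_mul(u22, v22))
    # Reassemble the four blocks into one matrix.
    return [a + b for a, b in zip(w11, w12)] + [a + b for a, b in zip(w21, w22)]


U = [[1, 2, 1], [3, 4, 0], [0, 0, 2]]
V = [[1, 2, 3, 1], [4, 5, 6, 1], [0, 0, 0, 1]]
blockwise = block_mul(U, V, 2, 2, 3)
```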


Solved Problems 


MATRIX ADDITION AND SCALAR MULTIPLICATION 
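3.1. Compute:


(i) $\begin{pmatrix} 1 & 2 & -3 & 4 \\ 0 & -5 & 1 & -1 \end{pmatrix} + \begin{pmatrix} 3 & -5 & 6 & -1 \\ 2 & 0 & -2 & -3 \end{pmatrix}$

(ii) $\begin{pmatrix} 1 & 2 & -3 \\ 0 & -4 & 1 \end{pmatrix} + \begin{pmatrix} 3 & 5 \\ 1 & -2 \end{pmatrix}$

(iii) $-3\begin{pmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{pmatrix}$


(i) Add corresponding entries:
$$\begin{pmatrix} 1+3 & 2-5 & -3+6 & 4-1 \\ 0+2 & -5+0 & 1-2 & -1-3 \end{pmatrix} = \begin{pmatrix} 4 & -3 & 3 & 3 \\ 2 & -5 & -1 & -4 \end{pmatrix}$$
(ii) The sum is not defined since the matrices have different shapes.


(iii) Multiply each entry in the matrix by the scalar -3:


$$-3\begin{pmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{pmatrix} = \begin{pmatrix} -3 & -6 & 9 \\ -12 & 15 & -18 \end{pmatrix}$$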


3.2. Let $A = \begin{pmatrix} 2 & -5 & 1 \\ 3 & 0 & -4 \end{pmatrix}$, $B = \begin{pmatrix} 1 & -2 & -3 \\ 0 & -1 & 5 \end{pmatrix}$, $C = \begin{pmatrix} 0 & 1 & -2 \\ 1 & -1 & -1 \end{pmatrix}$. Find 3A + 4B - 2C.


First perform the scalar multiplication, and then the matrix addition:
$$3A + 4B - 2C = \begin{pmatrix} 6 & -15 & 3 \\ 9 & 0 & -12 \end{pmatrix} + \begin{pmatrix} 4 & -8 & -12 \\ 0 & -4 & 20 \end{pmatrix} + \begin{pmatrix} 0 & -2 & 4 \\ -2 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 10 & -25 & -5 \\ 7 & -2 & 10 \end{pmatrix}$$


3.3. Find x, y, z and w if $3\begin{pmatrix} x & y \\ z & w \end{pmatrix} = \begin{pmatrix} x & 6 \\ -1 & 2w \end{pmatrix} + \begin{pmatrix} 4 & x+y \\ z+w & 3 \end{pmatrix}$.


First write each side as a single matrix:
$$\begin{pmatrix} 3x & 3y \\ 3z & 3w \end{pmatrix} = \begin{pmatrix} x+4 & x+y+6 \\ z+w-1 & 2w+3 \end{pmatrix}$$
Set corresponding entries equal to each other to obtain the system of four equations,
$$3x = x + 4, \quad 3y = x + y + 6, \quad 3z = z + w - 1, \quad 3w = 2w + 3$$
or
$$2x = 4, \quad 2y = 6 + x, \quad 2z = w - 1, \quad w = 3$$


The solution is: x = 2, y = 4, z = 1, w = 3.


3.4. Prove Theorem 3.1(v): Let A and B be $m \times n$ matrices and k a scalar. Then
k(A + B) = kA + kB.


Suppose $A = (a_{ij})$ and $B = (b_{ij})$. Then $a_{ij} + b_{ij}$ is the ij-entry of A + B, and so $k(a_{ij} + b_{ij})$
is the ij-entry of k(A + B). On the other hand, $ka_{ij}$ and $kb_{ij}$ are the ij-entries of kA and kB respectively
and so $ka_{ij} + kb_{ij}$ is the ij-entry of kA + kB. But k, $a_{ij}$ and $b_{ij}$ are scalars in a field; hence


$$k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij}, \quad \text{for every } i, j$$
Thus k(A + B) = kA + kB, as corresponding entries are equal.


Remark: Observe the similarity of this proof and the proof of Theorem 1.1(v) in Problem 1.6, page
7. In fact, all other sections in the above theorem are proven in the same way as the
corresponding sections of Theorem 1.1.


MATRIX MULTIPLICATION 


3.5. Let $(r \times s)$ denote a matrix with shape $r \times s$. Find the shape of the following products
if the product is defined:

(i) $(2 \times 3)(3 \times 4)$
(ii) $(4 \times 1)(1 \times 2)$
(iii) $(1 \times 2)(3 \times 1)$
(iv) $(5 \times 2)(2 \times 3)$
(v) $(3 \times 4)(3 \times 4)$
(vi) $(2 \times 2)(2 \times 4)$


Recall that an $m \times p$ matrix and a $q \times n$ matrix are multipliable only when p = q, and then
the product is an $m \times n$ matrix. Thus each of the above products is defined if the "inner" numbers
are equal, and then the product will have the shape of the "outer" numbers in the given order.


(i) The product is a $2 \times 4$ matrix.
(ii) The product is a $4 \times 2$ matrix.
(iii) The product is not defined since the inner numbers 2 and 3 are not equal.
(iv) The product is a $5 \times 3$ matrix.
(v) The product is not defined even though the matrices have the same shape.
(vi) The product is a $2 \times 4$ matrix.
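
The "inner numbers, outer numbers" rule can be written as a one-line check. In this sketch the function name `product_shape` is hypothetical and shapes are assumed to be given as (rows, columns) pairs; `None` stands for "not defined".

```python
def product_shape(left, right):
    """Shape of the product of an (m x p) and a (q x n) matrix, or None if p != q."""
    m, p = left
    q, n = right
    return (m, n) if p == q else None


shapes = [
    product_shape((2, 3), (3, 4)),
    product_shape((4, 1), (1, 2)),
    product_shape((1, 2), (3, 1)),
    product_shape((5, 2), (2, 3)),
    product_shape((3, 4), (3, 4)),
    product_shape((2, 2), (2, 4)),
]
```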


3.6. Let $A = \begin{pmatrix} 1 & 3 \\ 2 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{pmatrix}$. Find (i) AB, (ii) BA.


(i) Since A is $2 \times 2$ and B is $2 \times 3$, the product AB is defined and is a $2 \times 3$ matrix. To obtain the
entries in the first row of AB, multiply the first row (1, 3) of A by the columns $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$, $\begin{pmatrix} 0 \\ -2 \end{pmatrix}$
and $\begin{pmatrix} -4 \\ 6 \end{pmatrix}$ of B, respectively:


$$(1 \cdot 2 + 3 \cdot 3, \ \ 1 \cdot 0 + 3 \cdot (-2), \ \ 1 \cdot (-4) + 3 \cdot 6) = (11, -6, 14)$$


To obtain the entries in the second row of AB, multiply the second row (2, -1) of A by the
columns of B, respectively:


$$(2 \cdot 2 + (-1) \cdot 3, \ \ 2 \cdot 0 + (-1) \cdot (-2), \ \ 2 \cdot (-4) + (-1) \cdot 6) = (1, 2, -14)$$


Thus $AB = \begin{pmatrix} 11 & -6 & 14 \\ 1 & 2 & -14 \end{pmatrix}$.


(ii) Note that B is $2 \times 3$ and A is $2 \times 2$. Since the inner numbers 3 and 2 are not equal, the product
BA is not defined.


3.7. Given $A = (2, 1)$ and $B = \begin{pmatrix} 1 & -2 & 0 \\ 4 & 5 & -3 \end{pmatrix}$, find (i) AB, (ii) BA.


(i) Since A is $1 \times 2$ and B is $2 \times 3$, the product AB is defined and is a $1 \times 3$ matrix, i.e. a row
vector with 3 components. To obtain the components of AB, multiply the row of A by each
column of B:


$$AB = (2, 1)\begin{pmatrix} 1 & -2 & 0 \\ 4 & 5 & -3 \end{pmatrix} = (2 \cdot 1 + 1 \cdot 4, \ \ 2 \cdot (-2) + 1 \cdot 5, \ \ 2 \cdot 0 + 1 \cdot (-3)) = (6, 1, -3)$$


(ii) Note that B is $2 \times 3$ and A is $1 \times 2$. Since the inner numbers 3 and 1 are not equal, the product
BA is not defined.


3.8. Given $A = \begin{pmatrix} 2 & -1 \\ 1 & 0 \\ -3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & -2 & -5 \\ 3 & 4 & 0 \end{pmatrix}$, find (i) AB, (ii) BA.


(i) Since A is $3 \times 2$ and B is $2 \times 3$, the product AB is defined and is a $3 \times 3$ matrix. To obtain the
first row of AB, multiply the first row of A by each column of B, respectively:


$$(2 - 3, \ \ -4 - 4, \ \ -10 + 0) = (-1, -8, -10)$$


To obtain the second row of AB, multiply the second row of A by each column of B,
respectively:


$$(1 + 0, \ \ -2 + 0, \ \ -5 + 0) = (1, -2, -5)$$


To obtain the third row of AB, multiply the third row of A by each column of B, respectively:


$$(-3 + 12, \ \ 6 + 16, \ \ 15 + 0) = (9, 22, 15)$$


Thus $AB = \begin{pmatrix} -1 & -8 & -10 \\ 1 & -2 & -5 \\ 9 & 22 & 15 \end{pmatrix}$.


(ii) Since B is $2 \times 3$ and A is $3 \times 2$, the product BA is defined and is a $2 \times 2$ matrix. To obtain the
first row of BA, multiply the first row of B by each column of A, respectively:


$$(2 - 2 + 15, \ \ -1 + 0 - 20) = (15, -21)$$


To obtain the second row of BA, multiply the second row of B by each column of A, respectively:


$$(6 + 4 + 0, \ \ -3 + 0 + 0) = (10, -3)$$


Thus $BA = \begin{pmatrix} 15 & -21 \\ 10 & -3 \end{pmatrix}$.


Remark: Observe that in this case both AB and BA are defined, but they are not equal; in fact they
do not even have the same shape.


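3.9. Let $A = \begin{pmatrix} 2 & -1 & 0 \\ 1 & 0 & -3 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & -4 & 0 & 1 \\ 2 & -1 & 3 & -1 \\ 4 & 0 & -2 & 0 \end{pmatrix}$.


(i) Determine the shape of AB. (ii) Let $c_{ij}$ denote the element in the ith row and
jth column of the product matrix AB, that is, $AB = (c_{ij})$. Find: $c_{23}$, $c_{14}$ and $c_{21}$.


(i) Since A is $2 \times 3$ and B is $3 \times 4$, the product AB is a $2 \times 4$ matrix.
(ii) Now $c_{ij}$ is defined as the product of the ith row of A by the jth column of B. Hence:


$$c_{23} = (1, 0, -3)\begin{pmatrix} 0 \\ 3 \\ -2 \end{pmatrix} = 1 \cdot 0 + 0 \cdot 3 + (-3) \cdot (-2) = 0 + 0 + 6 = 6$$
$$c_{14} = (2, -1, 0)\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} = 2 \cdot 1 + (-1) \cdot (-1) + 0 \cdot 0 = 2 + 1 + 0 = 3$$
$$c_{21} = (1, 0, -3)\begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} = 1 \cdot 1 + 0 \cdot 2 + (-3) \cdot 4 = 1 + 0 - 12 = -11$$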
4 


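3.10. Compute: (i) $\begin{pmatrix} 1 & 6 \\ -3 & 5 \end{pmatrix}\begin{pmatrix} 4 & 0 \\ 2 & -1 \end{pmatrix}$ (ii) $\begin{pmatrix} 1 & 6 \\ -3 & 5 \end{pmatrix}\begin{pmatrix} 2 \\ -7 \end{pmatrix}$ (iii) $\begin{pmatrix} 2 \\ -7 \end{pmatrix}\begin{pmatrix} 1 & 6 \\ -3 & 5 \end{pmatrix}$ (iv) $\begin{pmatrix} 1 \\ 6 \end{pmatrix}(3, 2)$ (v) $(2, -1)\begin{pmatrix} 1 \\ -6 \end{pmatrix}$


(i) The first factor is $2 \times 2$ and the second is $2 \times 2$, so the product is defined and is a $2 \times 2$ matrix:
$$\begin{pmatrix} 1 & 6 \\ -3 & 5 \end{pmatrix}\begin{pmatrix} 4 & 0 \\ 2 & -1 \end{pmatrix} = \begin{pmatrix} 1 \cdot 4 + 6 \cdot 2 & 1 \cdot 0 + 6 \cdot (-1) \\ (-3) \cdot 4 + 5 \cdot 2 & (-3) \cdot 0 + 5 \cdot (-1) \end{pmatrix} = \begin{pmatrix} 16 & -6 \\ -2 & -5 \end{pmatrix}$$
(ii) The first factor is $2 \times 2$ and the second is $2 \times 1$, so the product is defined and is a $2 \times 1$ matrix:
$$\begin{pmatrix} 1 & 6 \\ -3 & 5 \end{pmatrix}\begin{pmatrix} 2 \\ -7 \end{pmatrix} = \begin{pmatrix} 1 \cdot 2 + 6 \cdot (-7) \\ (-3) \cdot 2 + 5 \cdot (-7) \end{pmatrix} = \begin{pmatrix} -40 \\ -41 \end{pmatrix}$$
(iii) Now the first factor is $2 \times 1$ and the second is $2 \times 2$. Since the inner numbers 1 and 2 are
distinct, the product is not defined.
(iv) Here the first factor is $2 \times 1$ and the second is $1 \times 2$, so the product is defined and is a $2 \times 2$
matrix:
$$\begin{pmatrix} 1 \\ 6 \end{pmatrix}(3, 2) = \begin{pmatrix} 1 \cdot 3 & 1 \cdot 2 \\ 6 \cdot 3 & 6 \cdot 2 \end{pmatrix} = \begin{pmatrix} 3 & 2 \\ 18 & 12 \end{pmatrix}$$
(v) The first factor is $1 \times 2$ and the second is $2 \times 1$, so the product is defined and is a $1 \times 1$ matrix
which we frequently write as a scalar.
$$(2, -1)\begin{pmatrix} 1 \\ -6 \end{pmatrix} = (2 \cdot 1 + (-1) \cdot (-6)) = (8) = 8$$


3.11. Prove Theorem 3.2(i): (AB)C = A(BC).


Let $A = (a_{ij})$, $B = (b_{jk})$ and $C = (c_{kl})$. Furthermore, let $AB = S = (s_{ik})$ and $BC = T = (t_{jl})$.
Then
$$s_{ik} = a_{i1}b_{1k} + a_{i2}b_{2k} + \cdots + a_{im}b_{mk} = \sum_{j=1}^{m} a_{ij}b_{jk}$$
$$t_{jl} = b_{j1}c_{1l} + b_{j2}c_{2l} + \cdots + b_{jn}c_{nl} = \sum_{k=1}^{n} b_{jk}c_{kl}$$


Now multiplying S by C, i.e. (AB) by C, the element in the ith row and lth column of the matrix
(AB)C is
$$s_{i1}c_{1l} + s_{i2}c_{2l} + \cdots + s_{in}c_{nl} = \sum_{k=1}^{n} s_{ik}c_{kl} = \sum_{k=1}^{n} \sum_{j=1}^{m} (a_{ij}b_{jk})c_{kl}$$


On the other hand, multiplying A by T, i.e. A by BC, the element in the ith row and lth column
of the matrix A(BC) is
$$a_{i1}t_{1l} + a_{i2}t_{2l} + \cdots + a_{im}t_{ml} = \sum_{j=1}^{m} a_{ij}t_{jl} = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{ij}(b_{jk}c_{kl})$$


Since the above sums are equal, the theorem is proven.


3.12. Prove Theorem 3.2(ii): A(B + C) = AB + AC.


Let $A = (a_{ij})$, $B = (b_{jk})$ and $C = (c_{jk})$. Furthermore, let $D = B + C = (d_{jk})$, $E = AB = (e_{ik})$
and $F = AC = (f_{ik})$. Then
$$d_{jk} = b_{jk} + c_{jk}$$
$$e_{ik} = a_{i1}b_{1k} + a_{i2}b_{2k} + \cdots + a_{im}b_{mk} = \sum_{j=1}^{m} a_{ij}b_{jk}$$
$$f_{ik} = a_{i1}c_{1k} + a_{i2}c_{2k} + \cdots + a_{im}c_{mk} = \sum_{j=1}^{m} a_{ij}c_{jk}$$


Hence the element in the ith row and kth column of the matrix AB + AC is
$$e_{ik} + f_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} + \sum_{j=1}^{m} a_{ij}c_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk})$$


On the other hand, the element in the ith row and kth column of the matrix AD = A(B + C) is
$$a_{i1}d_{1k} + a_{i2}d_{2k} + \cdots + a_{im}d_{mk} = \sum_{j=1}^{m} a_{ij}d_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk})$$


Thus A(B + C) = AB + AC since the corresponding elements are equal.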


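TRANSPOSE

3.13. Find the transpose $A^t$ of the matrix $A = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 2 & 3 & 4 & 5 \\ 4 & 4 & 4 & 4 \end{pmatrix}$.

Rewrite the rows of A as the columns of $A^t$: $A^t = \begin{pmatrix} 1 & 2 & 4 \\ 0 & 3 & 4 \\ 1 & 4 & 4 \\ 0 & 5 & 4 \end{pmatrix}$.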


3.14. Let A be an arbitrary matrix. Under what conditions is the product $AA^t$ defined?


Suppose A is an $m \times n$ matrix; then $A^t$ is $n \times m$. Thus the product $AA^t$ is always defined.
Observe that $A^tA$ is also defined. Here $AA^t$ is an $m \times m$ matrix, whereas $A^tA$ is an $n \times n$ matrix.


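3.15. Let $A = \begin{pmatrix} 1 & 2 & 0 \\ 3 & -1 & 4 \end{pmatrix}$. Find (i) $AA^t$, (ii) $A^tA$.

To obtain $A^t$, rewrite the rows of A as columns: $A^t = \begin{pmatrix} 1 & 3 \\ 2 & -1 \\ 0 & 4 \end{pmatrix}$. Then
$$AA^t = \begin{pmatrix} 1 & 2 & 0 \\ 3 & -1 & 4 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & -1 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 2 + 0 \cdot 0 & 1 \cdot 3 + 2 \cdot (-1) + 0 \cdot 4 \\ 3 \cdot 1 + (-1) \cdot 2 + 4 \cdot 0 & 3 \cdot 3 + (-1) \cdot (-1) + 4 \cdot 4 \end{pmatrix} = \begin{pmatrix} 5 & 1 \\ 1 & 26 \end{pmatrix}$$
$$A^tA = \begin{pmatrix} 1 & 3 \\ 2 & -1 \\ 0 & 4 \end{pmatrix}\begin{pmatrix} 1 & 2 & 0 \\ 3 & -1 & 4 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 3 \cdot 3 & 1 \cdot 2 + 3 \cdot (-1) & 1 \cdot 0 + 3 \cdot 4 \\ 2 \cdot 1 + (-1) \cdot 3 & 2 \cdot 2 + (-1) \cdot (-1) & 2 \cdot 0 + (-1) \cdot 4 \\ 0 \cdot 1 + 4 \cdot 3 & 0 \cdot 2 + 4 \cdot (-1) & 0 \cdot 0 + 4 \cdot 4 \end{pmatrix} = \begin{pmatrix} 10 & -1 & 12 \\ -1 & 5 & -4 \\ 12 & -4 & 16 \end{pmatrix}$$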


3.16. Prove Theorem 3.3(iv): $(AB)^t = B^tA^t$.
Let $A = (a_{ij})$ and $B = (b_{jk})$. Then the element in the ith row and jth column of the matrix
AB is
$$a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{im}b_{mj} \qquad (1)$$
Thus (1) is the element which appears in the jth row and ith column of the transpose matrix $(AB)^t$.
On the other hand, the jth row of $B^t$ consists of the elements from the jth column of B:
$$(b_{1j} \ \ b_{2j} \ \ \ldots \ \ b_{mj}) \qquad (2)$$


Furthermore, the ith column of $A^t$ consists of the elements from the ith row of A:


$$\begin{pmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{im} \end{pmatrix} \qquad (3)$$


Consequently, the element appearing in the jth row and ith column of the matrix $B^tA^t$ is the
product of (2) by (3) which gives (1). Thus $(AB)^t = B^tA^t$.
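
The transpose rules can be spot-checked numerically; the following Python sketch does so for one pair of matrices. It illustrates the identity $(AB)^t = B^tA^t$ but of course does not prove it. The helper names are hypothetical, and the matrix `b` is an arbitrary choice made for this sketch.

```python
def transpose(m):
    """Rewrite the rows of m as columns."""
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1, 2, 0], [3, -1, 4]]
b = [[2, 1], [0, -3], [5, 4]]          # arbitrary 3 x 2 matrix (assumption)

lhs = transpose(mat_mul(a, b))             # (AB)^t
rhs = mat_mul(transpose(b), transpose(a))  # B^t A^t
a_at = mat_mul(a, transpose(a))            # 2 x 2
at_a = mat_mul(transpose(a), a)            # 3 x 3
```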




ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS 


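3.17. Circle the distinguished elements in each of the following echelon matrices. Which
are row reduced echelon matrices?


$$\begin{pmatrix} 1 & -2 & 3 & 0 & 1 \\ 0 & 0 & 5 & 2 & -4 \\ 0 & 0 & 0 & 7 & 3 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 7 & -5 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 5 & 0 & 2 \\ 0 & 1 & 2 & 0 & 4 \\ 0 & 0 & 0 & 1 & 7 \end{pmatrix}$$


The distinguished elements are the first nonzero entries in the rows; hence


$$\begin{pmatrix} \boxed{1} & -2 & 3 & 0 & 1 \\ 0 & 0 & \boxed{5} & 2 & -4 \\ 0 & 0 & 0 & \boxed{7} & 3 \end{pmatrix}, \quad \begin{pmatrix} 0 & \boxed{1} & 7 & -5 & 0 \\ 0 & 0 & 0 & 0 & \boxed{1} \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} \boxed{1} & 0 & 5 & 0 & 2 \\ 0 & \boxed{1} & 2 & 0 & 4 \\ 0 & 0 & 0 & \boxed{1} & 7 \end{pmatrix}$$


An echelon matrix is row reduced if its distinguished elements are each 1 and are the only nonzero
entries in their respective columns. Thus the second and third matrices are row reduced, but the
first is not.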
3.18. Given $A = \begin{pmatrix} 1 & -2 & 3 & -1 \\ 2 & -1 & 2 & 2 \\ 3 & 1 & 2 & 3 \end{pmatrix}$. (i) Reduce A to echelon form. (ii) Reduce A to row
canonical form, i.e. to row reduced echelon form.


(i) Apply the operations $R_2 \to -2R_1 + R_2$ and $R_3 \to -3R_1 + R_3$, and then the operation
$R_3 \to -7R_2 + 3R_3$ to reduce A to echelon form:


$$A \text{ to } \begin{pmatrix} 1 & -2 & 3 & -1 \\ 0 & 3 & -4 & 4 \\ 0 & 7 & -7 & 6 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 3 & -1 \\ 0 & 3 & -4 & 4 \\ 0 & 0 & 7 & -10 \end{pmatrix}$$


(ii) Method 1. Apply the operation $R_1 \to 2R_2 + 3R_1$, and then the operations $R_1 \to -R_3 + 7R_1$
and $R_2 \to 4R_3 + 7R_2$ to the last matrix in (i) to further reduce A:


$$\text{to } \begin{pmatrix} 3 & 0 & 1 & 5 \\ 0 & 3 & -4 & 4 \\ 0 & 0 & 7 & -10 \end{pmatrix} \text{ to } \begin{pmatrix} 21 & 0 & 0 & 45 \\ 0 & 21 & 0 & -12 \\ 0 & 0 & 7 & -10 \end{pmatrix}$$


Finally, multiply $R_1$ by 1/21, $R_2$ by 1/21 and $R_3$ by 1/7 to obtain the row canonical form of A:


$$\begin{pmatrix} 1 & 0 & 0 & 15/7 \\ 0 & 1 & 0 & -4/7 \\ 0 & 0 & 1 & -10/7 \end{pmatrix}$$


Method 2. In the last matrix in (i), multiply $R_2$ by 1/3 and $R_3$ by 1/7 to obtain an echelon
matrix where the distinguished elements are each 1:


$$\begin{pmatrix} 1 & -2 & 3 & -1 \\ 0 & 1 & -4/3 & 4/3 \\ 0 & 0 & 1 & -10/7 \end{pmatrix}$$


Now apply the operation $R_1 \to 2R_2 + R_1$, and then the operations $R_2 \to (4/3)R_3 + R_2$ and
$R_1 \to (-1/3)R_3 + R_1$ to obtain the above row canonical form of A.


Remark: Observe that one advantage of the first method is that fractions did not appear
until the very last step.


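3.19. Determine the row canonical form of $A = \begin{pmatrix} 0 & 1 & 3 & -2 \\ 2 & 1 & -4 & 3 \\ 2 & 3 & 2 & -1 \end{pmatrix}$.

$$A \text{ to } \begin{pmatrix} 2 & 1 & -4 & 3 \\ 0 & 1 & 3 & -2 \\ 2 & 3 & 2 & -1 \end{pmatrix} \text{ to } \begin{pmatrix} 2 & 1 & -4 & 3 \\ 0 & 1 & 3 & -2 \\ 0 & 2 & 6 & -4 \end{pmatrix} \text{ to } \begin{pmatrix} 2 & 1 & -4 & 3 \\ 0 & 1 & 3 & -2 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
$$\text{to } \begin{pmatrix} 2 & 0 & -7 & 5 \\ 0 & 1 & 3 & -2 \\ 0 & 0 & 0 & 0 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 0 & -7/2 & 5/2 \\ 0 & 1 & 3 & -2 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$


Note that the third matrix is already in echelon form.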


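3.20. Reduce $A = \begin{pmatrix} 6 & 3 & -4 \\ -4 & 1 & -6 \\ 1 & 2 & -5 \end{pmatrix}$ to echelon form, and then to row reduced echelon form,
i.e. to its row canonical form.


The computations are usually simpler if the "pivotal" element is 1. Hence first interchange the
first and third rows:


$$A \text{ to } \begin{pmatrix} 1 & 2 & -5 \\ -4 & 1 & -6 \\ 6 & 3 & -4 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & -9 & 26 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & 0 & 0 \end{pmatrix}$$
$$\text{to } \begin{pmatrix} 1 & 2 & -5 \\ 0 & 1 & -26/9 \\ 0 & 0 & 0 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 0 & 7/9 \\ 0 & 1 & -26/9 \\ 0 & 0 & 0 \end{pmatrix}$$


Note that the third matrix is already in echelon form.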


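3.21. Show that each of the following elementary row operations has an inverse operation
of the same type.

[$E_1$] Interchange the ith row and the jth row: $R_i \leftrightarrow R_j$.

[$E_2$] Multiply the ith row by a nonzero scalar k: $R_i \to kR_i$, $k \neq 0$.

[$E_3$] Replace the ith row by k times the jth row plus the ith row: $R_i \to kR_j + R_i$.


(i) Interchanging the same two rows twice, we obtain the original matrix; that is, this operation
is its own inverse.

(ii) Multiplying the ith row by k and then by $k^{-1}$, or by $k^{-1}$ and then by k, we obtain the original
matrix. In other words, the operations $R_i \to kR_i$ and $R_i \to k^{-1}R_i$ are inverses.

(iii) Applying the operation $R_i \to kR_j + R_i$ and then the operation $R_i \to -kR_j + R_i$, or applying
the operation $R_i \to -kR_j + R_i$ and then the operation $R_i \to kR_j + R_i$, we obtain the original
matrix. In other words, the operations $R_i \to kR_j + R_i$ and $R_i \to -kR_j + R_i$ are
inverses.


SQUARE MATRICES


3.22. Let $A = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}$. Find (i) $A^2$, (ii) $A^3$, (iii) f(A), where $f(x) = 2x^3 - 4x + 5$.


(i) $$A^2 = AA = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 4 & 1 \cdot 2 + 2 \cdot (-3) \\ 4 \cdot 1 + (-3) \cdot 4 & 4 \cdot 2 + (-3) \cdot (-3) \end{pmatrix} = \begin{pmatrix} 9 & -4 \\ -8 & 17 \end{pmatrix}$$

(ii) $$A^3 = AA^2 = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} 9 & -4 \\ -8 & 17 \end{pmatrix} = \begin{pmatrix} 1 \cdot 9 + 2 \cdot (-8) & 1 \cdot (-4) + 2 \cdot 17 \\ 4 \cdot 9 + (-3) \cdot (-8) & 4 \cdot (-4) + (-3) \cdot 17 \end{pmatrix} = \begin{pmatrix} -7 & 30 \\ 60 & -67 \end{pmatrix}$$

(iii) To find f(A), first substitute A for x and 5I for the constant 5 in the given polynomial
$f(x) = 2x^3 - 4x + 5$:
$$f(A) = 2A^3 - 4A + 5I = 2\begin{pmatrix} -7 & 30 \\ 60 & -67 \end{pmatrix} - 4\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix} + 5\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
Then multiply each matrix by its respective scalar:
$$= \begin{pmatrix} -14 & 60 \\ 120 & -134 \end{pmatrix} + \begin{pmatrix} -4 & -8 \\ -16 & 12 \end{pmatrix} + \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix}$$
Lastly, add the corresponding elements in the matrices:
$$= \begin{pmatrix} -14-4+5 & 60-8+0 \\ 120-16+0 & -134+12+5 \end{pmatrix} = \begin{pmatrix} -13 & 52 \\ 104 & -117 \end{pmatrix}$$


3.23. Referring to Problem 3.22, show that A is a zero of the polynomial $g(x) = x^2 + 2x - 11$.


A is a zero of g(x) if the matrix g(A) is the zero matrix. Compute g(A) as was done for f(A),
i.e. first substitute A for x and 11I for the constant 11 in $g(x) = x^2 + 2x - 11$:
$$g(A) = A^2 + 2A - 11I = \begin{pmatrix} 9 & -4 \\ -8 & 17 \end{pmatrix} + 2\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix} - 11\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
Then multiply each matrix by the scalar preceding it:
$$g(A) = \begin{pmatrix} 9 & -4 \\ -8 & 17 \end{pmatrix} + \begin{pmatrix} 2 & 4 \\ 8 & -6 \end{pmatrix} + \begin{pmatrix} -11 & 0 \\ 0 & -11 \end{pmatrix}$$
Lastly, add the corresponding elements in the matrices:
$$g(A) = \begin{pmatrix} 9+2-11 & -4+4+0 \\ -8+8+0 & 17-6-11 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
Since g(A) = 0, A is a zero of the polynomial g(x).


3.24. Given $A = \begin{pmatrix} 1 & 3 \\ 4 & -3 \end{pmatrix}$. Find a nonzero column vector $u = \begin{pmatrix} x \\ y \end{pmatrix}$ such that Au = 3u.


First set up the matrix equation Au = 3u:
$$\begin{pmatrix} 1 & 3 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 3\begin{pmatrix} x \\ y \end{pmatrix}$$
Write each side as a single matrix (column vector):
$$\begin{pmatrix} x + 3y \\ 4x - 3y \end{pmatrix} = \begin{pmatrix} 3x \\ 3y \end{pmatrix}$$
Set corresponding elements equal to each other to obtain the system of equations (and reduce to
echelon form):
$$\begin{cases} x + 3y = 3x \\ 4x - 3y = 3y \end{cases} \text{ or } \begin{cases} 2x - 3y = 0 \\ 4x - 6y = 0 \end{cases} \text{ or } \begin{cases} 2x - 3y = 0 \\ 0 = 0 \end{cases} \text{ or } 2x - 3y = 0$$
The system reduces to one homogeneous equation in two unknowns, and so has an infinite number
of solutions. To obtain a nonzero solution let, say, y = 2; then x = 3. That is, x = 3, y = 2 is a
solution of the system. Thus the vector $u = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$ is nonzero and has the property that Au = 3u.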


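3.25. Find the inverse of $\begin{pmatrix} 3 & 5 \\ 2 & 3 \end{pmatrix}$.


Method 1. We seek scalars x, y, z and w for which


$$\begin{pmatrix} 3 & 5 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} x & y \\ z & w \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 3x + 5z & 3y + 5w \\ 2x + 3z & 2y + 3w \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$


or which satisfy
$$\begin{cases} 3x + 5z = 1 \\ 2x + 3z = 0 \end{cases} \quad \text{and} \quad \begin{cases} 3y + 5w = 0 \\ 2y + 3w = 1 \end{cases}$$
The solution of the first system is x = -3, z = 2, and of the second system is y = 5, w = -3.
Thus the inverse of the given matrix is $\begin{pmatrix} -3 & 5 \\ 2 & -3 \end{pmatrix}$.


Method 2. We derived the general formula for the inverse $A^{-1}$ of the $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$:


$$A^{-1} = \frac{1}{|A|}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \quad \text{where } |A| = ad - bc$$
Thus if $A = \begin{pmatrix} 3 & 5 \\ 2 & 3 \end{pmatrix}$, then $|A| = 9 - 10 = -1$ and $A^{-1} = -1\begin{pmatrix} 3 & -5 \\ -2 & 3 \end{pmatrix} = \begin{pmatrix} -3 & 5 \\ 2 & -3 \end{pmatrix}$.
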
MISCELLANEOUS PROBLEMS 
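3.26. Compute AB using block multiplication, where
$$A = \left(\begin{array}{cc|c} 1 & 2 & 1 \\ 3 & 4 & 0 \\ \hline 0 & 0 & 2 \end{array}\right) \quad \text{and} \quad B = \left(\begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 4 & 5 & 6 & 1 \\ \hline 0 & 0 & 0 & 1 \end{array}\right)$$

Here $A = \begin{pmatrix} E & F \\ 0 & G \end{pmatrix}$ and $B = \begin{pmatrix} R & S \\ 0 & T \end{pmatrix}$ where E, F, G, R, S and T are the given blocks.

Hence
$$AB = \begin{pmatrix} ER & ES + FT \\ 0 & GT \end{pmatrix} = \begin{pmatrix} \begin{pmatrix} 9 & 12 & 15 \\ 19 & 26 & 33 \end{pmatrix} & \begin{pmatrix} 3 \\ 7 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ (0 \ \ 0 \ \ 0) & (2) \end{pmatrix} = \begin{pmatrix} 9 & 12 & 15 & 4 \\ 19 & 26 & 33 & 7 \\ 0 & 0 & 0 & 2 \end{pmatrix}$$
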
3.27. Suppose $B = (R_1, R_2, \ldots, R_n)$, i.e. that $R_i$ is the ith row of B. Suppose BA is defined.
Show that $BA = (R_1A, R_2A, \ldots, R_nA)$, i.e. that $R_iA$ is the ith row of BA.

Let $A^1, A^2, \ldots, A^m$ denote the columns of A. By definition of matrix multiplication, the ith row
of BA is $(R_i \cdot A^1, R_i \cdot A^2, \ldots, R_i \cdot A^m)$. But by matrix multiplication, $R_iA = (R_i \cdot A^1, R_i \cdot A^2, \ldots, R_i \cdot A^m)$.
Thus the ith row of BA is $R_iA$.

3.28. Let $e_i = (0, \ldots, 1, \ldots, 0)$ be the row vector with 1 in the ith position and 0 elsewhere.
Show that $e_iA = R_i$, the ith row of A.

Observe that $e_i$ is the ith row of I, the identity matrix. By the preceding problem, the ith row
of IA is $e_iA$. But IA = A. Accordingly, $e_iA = R_i$, the ith row of A.

3.29. Show: (i) If A has a zero row, then AB has a zero row.
(ii) If B has a zero column, then AB has a zero column.
(iii) Any matrix with a zero row or a zero column is not invertible.


(i) Let $R_i$ be the zero row of A, and $B^1, \ldots, B^n$ the columns of B. Then the ith row of AB is
$$(R_i \cdot B^1, R_i \cdot B^2, \ldots, R_i \cdot B^n) = (0, 0, \ldots, 0)$$


(ii) Let $C_j$ be the zero column of B, and $A_1, \ldots, A_m$ the rows of A. Then the jth column of AB is


$$\begin{pmatrix} A_1 \cdot C_j \\ A_2 \cdot C_j \\ \vdots \\ A_m \cdot C_j \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$


(iii) A matrix A is invertible means that there exists a matrix $A^{-1}$ such that $AA^{-1} = A^{-1}A = I$.
But the identity matrix I has no zero row or zero column; hence by (i) and (ii) A cannot have
a zero row or a zero column. In other words, a matrix with a zero row or a zero column cannot
be invertible.


3.30. Let A and B be invertible matrices (of the same order). Show that the product
AB is also invertible and $(AB)^{-1} = B^{-1}A^{-1}$. Thus by induction, $(A_1A_2 \cdots A_n)^{-1} = A_n^{-1} \cdots A_2^{-1}A_1^{-1}$
where the $A_i$ are invertible.


$$(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$$
and
$$(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I$$


Thus $(AB)^{-1} = B^{-1}A^{-1}$.


3.31. Let u and v be distinct vectors. Show that, for distinct scalars $k \in K$, the vectors
u + k(u - v) are distinct.


It suffices to show that if
$$u + k_1(u - v) = u + k_2(u - v) \qquad (1)$$


then $k_1 = k_2$. Suppose (1) holds. Then
$$k_1(u - v) = k_2(u - v) \quad \text{or} \quad (k_1 - k_2)(u - v) = 0$$


Since u and v are distinct, $u - v \neq 0$. Hence $k_1 - k_2 = 0$ and $k_1 = k_2$.


ELEMENTARY MATRICES AND APPLICATIONS* 


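3.32. A matrix obtained from the identity matrix by a single elementary row operation is
called an elementary matrix. Determine the 3-square elementary matrices corresponding
to the operations $R_1 \leftrightarrow R_2$, $R_3 \to -7R_3$ and $R_2 \to -3R_1 + R_2$.


Apply the operations to the identity matrix $I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ to obtain
$$E_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad E_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -7 \end{pmatrix}, \quad E_3 = \begin{pmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$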


3.33. Prove: Let e be an elementary row operation and E the corresponding m-square elementary
matrix, i.e. $E = e(I_m)$. Then for any $m \times n$ matrix A, e(A) = EA. That is, the result
e(A) of applying the operation e on the matrix A can be obtained by multiplying
A by the corresponding elementary matrix E.


Let $R_i$ be the ith row of A; we denote this by writing $A = (R_1, \ldots, R_m)$. By Problem 3.27, if
B is a matrix for which AB is defined, then $AB = (R_1B, \ldots, R_mB)$. We also let
$$e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$$
where 1 is the ith component. By Problem 3.28, $e_iA = R_i$. We also remark that
$I = (e_1, \ldots, e_m)$ is the identity matrix.


*This section is rather detailed and may be omitted in a first reading. It is not needed except for certain
results in Chapter 9 on determinants.


(i) Let e be the elementary row operation $R_i \leftrightarrow R_j$. Then, with $e_j$ and $R_j$ in the ith position and
$e_i$ and $R_i$ in the jth position,
$$E = e(I) = (e_1, \ldots, e_j, \ldots, e_i, \ldots, e_m) \quad \text{and} \quad e(A) = (R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m)$$
Thus
$$EA = (e_1A, \ldots, e_jA, \ldots, e_iA, \ldots, e_mA) = (R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m) = e(A)$$


(ii) Now let e be the elementary row operation $R_i \to kR_i$, $k \neq 0$. Then, with $ke_i$ and $kR_i$ in the ith position,
$$E = e(I) = (e_1, \ldots, ke_i, \ldots, e_m) \quad \text{and} \quad e(A) = (R_1, \ldots, kR_i, \ldots, R_m)$$
Thus
$$EA = (e_1A, \ldots, ke_iA, \ldots, e_mA) = (R_1, \ldots, kR_i, \ldots, R_m) = e(A)$$


(iii) Lastly, let e be the elementary row operation $R_i \to kR_j + R_i$. Then, with $ke_j + e_i$ and $kR_j + R_i$ in the ith position,
$$E = e(I) = (e_1, \ldots, ke_j + e_i, \ldots, e_m) \quad \text{and} \quad e(A) = (R_1, \ldots, kR_j + R_i, \ldots, R_m)$$
Using $(ke_j + e_i)A = k(e_jA) + e_iA = kR_j + R_i$, we have
$$EA = (e_1A, \ldots, (ke_j + e_i)A, \ldots, e_mA) = (R_1, \ldots, kR_j + R_i, \ldots, R_m) = e(A)$$


Thus we have proven the theorem.


3.34. Show that A is row equivalent to B if and only if there exist elementary matrices
$E_1, \ldots, E_s$ such that $E_s \cdots E_2E_1A = B$.


By definition, A is row equivalent to B if there exist elementary row operations $e_1, \ldots, e_s$ for
which $e_s(\cdots(e_2(e_1(A)))\cdots) = B$. But, by the preceding problem, the above holds if and only if
$E_s \cdots E_2E_1A = B$ where $E_i$ is the elementary matrix corresponding to $e_i$.


3.35. Show that the elementary matrices are invertible and that their inverses are also
elementary matrices.

Let E be the elementary matrix corresponding to the elementary row operation e: e(I) = E.
Let e' be the inverse operation of e (see Problem 3.21) and E' its corresponding elementary matrix.
Then, by Problem 3.33,
$$I = e'(e(I)) = e'(E) = E'E \quad \text{and} \quad I = e(e'(I)) = e(E') = EE'$$


Therefore E' is the inverse of E.


3.36. Prove that the following are equivalent:

(i) A is invertible.

(ii) A is row equivalent to the identity matrix I.

(iii) A is a product of elementary matrices.


Suppose A is invertible and suppose A is row equivalent to the row reduced echelon matrix B.
Then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = B$. Since A is invertible
and each elementary matrix $E_i$ is invertible, the product is invertible. But if $B \neq I$, then B
has a zero row (Problem 3.47); hence B is not invertible (Problem 3.29). Thus B = I. In other
words, (i) implies (ii).

Now if (ii) holds, then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that
$$E_s \cdots E_2E_1A = I, \quad \text{and so} \quad A = (E_s \cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1} \cdots E_s^{-1}$$


By the preceding problem, the $E_i^{-1}$ are also elementary matrices. Thus (ii) implies (iii).

Now if (iii) holds ($A = E_1E_2 \cdots E_s$), then (i) must follow since the product of invertible
matrices is invertible.
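
The statement e(A) = EA of Problem 3.33 can be spot-checked in Python for the three types of elementary row operation. This sketch is illustrative only: the helper names and the sample matrix `a` are hypothetical, and rows are indexed from 0 in the code, so `interchange(m, 0, 1)` stands for $R_1 \leftrightarrow R_2$.

```python
def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def interchange(m, i, j):
    """R_i <-> R_j."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out


def scale(m, i, k):
    """R_i -> k R_i, with k nonzero."""
    out = [row[:] for row in m]
    out[i] = [k * x for x in out[i]]
    return out


def add_multiple(m, i, j, k):
    """R_i -> k R_j + R_i."""
    out = [row[:] for row in m]
    out[i] = [k * y + x for x, y in zip(m[i], m[j])]
    return out


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # arbitrary 3 x 4 matrix
operations = [
    lambda m: interchange(m, 0, 1),        # R_1 <-> R_2
    lambda m: scale(m, 2, -7),             # R_3 -> -7 R_3
    lambda m: add_multiple(m, 1, 0, -3),   # R_2 -> -3 R_1 + R_2
]
# For each operation e, compare E A (with E = e(I)) against e(A).
checks = [mat_mul(e(identity(3)), a) == e(a) for e in operations]
```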




3.37. Let A and B be square matrices of the same order. Show that if AB = I, then
$B = A^{-1}$. Thus AB = I if and only if BA = I.

Suppose A is not invertible. Then A is not row equivalent to the identity matrix I, and so A
is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices
$E_1, \ldots, E_s$ such that $E_s \cdots E_2E_1A$ has a zero row. Hence $E_s \cdots E_2E_1AB$ has a zero row. Accordingly,
AB is row equivalent to a matrix with a zero row and so is not row equivalent to I. But this
contradicts the fact that AB = I. Thus A is invertible. Consequently,


$$B = IB = (A^{-1}A)B = A^{-1}(AB) = A^{-1}I = A^{-1}$$


3.38. Suppose A is invertible and, say, it is row reducible to the identity matrix I by the
sequence of elementary operations $e_1, \ldots, e_n$. (i) Show that this sequence of elementary
row operations applied to I yields $A^{-1}$. (ii) Use this result to obtain the inverse
of $A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{pmatrix}$.


(i) Let $E_i$ be the elementary matrix corresponding to the operation $e_i$. Then, by hypothesis and
Problem 3.34, $E_n \cdots E_2E_1A = I$. Thus $(E_n \cdots E_2E_1I)A = I$ and hence $A^{-1} = E_n \cdots E_2E_1I$.
In other words, $A^{-1}$ can be obtained from I by applying the elementary row operations $e_1, \ldots, e_n$.


(ii) Form the block matrix (A, I) and row reduce it to row canonical form:


$$(A, I) = \left(\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & -1 & 3 & 0 & 1 & 0 \\ 4 & 1 & 8 & 0 & 0 & 1 \end{array}\right) \text{ to } \left(\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -4 & 0 & 1 \end{array}\right) \text{ to } \left(\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{array}\right)$$
$$\text{to } \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & -1 & 0 & 4 & 0 & -1 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{array}\right) \text{ to } \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & 1 & 0 & -4 & 0 & 1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right)$$


Observe that the final block matrix is in the form (I, B). Hence A is invertible and B is its
inverse:
$$A^{-1} = \begin{pmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{pmatrix}$$


Remark: In case the final block matrix is not of the form (I, B), then the given matrix is not
row equivalent to I and so is not invertible.
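
The method of Problem 3.38 can be sketched in Python: form the block matrix (A, I), row reduce it, and read off the right-hand block. The function name `inverse` is hypothetical, exact arithmetic with `fractions.Fraction` is an assumption of this sketch, and `None` is returned when A is not row equivalent to I.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a square matrix by row reducing the block matrix (A, I)."""
    n = len(matrix)
    # Form the block matrix (A, I).
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None                      # A is not row equivalent to I
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]  # distinguished element becomes 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                a = aug[r][col]
                aug[r] = [y - a * x for x, y in zip(aug[col], aug[r])]
    # The block matrix is now (I, B); B is the inverse.
    return [row[n:] for row in aug]


inv = inverse([[1, 0, 2], [2, -1, 3], [4, 1, 8]])
```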


Supplementary Problems 


MATRIX OPERATIONS 
In Problems 3.39-3.41, let 


SO 88 1)? haces 3) & ae Bde tas || 
ENON DAL AS 3 


3.39. Find: (i) A+B, (ii) A+C, (iii) 3A — 4B. 
3.40. Find: (i) AB, (ii) AC, (iii) AD, (iv) BC, (v) BD, (vi) CD. 


3.41. Find: (i) $A^t$, (ii) $A^tC$, (iii) $D^tA^t$, (iv) $B^tA$, (v) $D^tD$, (vi) $DD^t$.


3.42. Let $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$ and $e_3 = (0, 0, 1)$. Given $A = \begin{pmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{pmatrix}$, find (i) $e_1A$,
(ii) $e_2A$, (iii) $e_3A$.


3.43. Let $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ where 1 is the ith component. Show the following:
(i) $Be_j^t = C_j$, the jth column of B. (By Problem 3.28, $e_iA = R_i$.)
(ii) If $e_iA = e_iB$ for each i, then A = B.
(iii) If $Ae_i^t = Be_i^t$ for each i, then A = B.


ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS 


3.44. Reduce A to echelon form and then to its row canonical form, where 


ROSS nate ee eS a tee 
(es AN? hi And D8 (i) AN GR ee OMB 
Sa 6a 6 8,76 45 65557 


3.45. Reduce A to echelon form and then to its row canonical form, where 


(es Et 8 at (ole ae 
piteb <8 Os a1 aca 
1 A = ‘ il Aa 
(i) ae aa (ii) Om Or men ST 

oe hen meee Hib aA 


3.46. Describe all the possible 2 X 2 matrices which are in row reduced echelon form. 


3.47. Suppose A is a square row reduced echelon matrix. Show that if $A \neq I$, the identity matrix, then
A has a zero row.


3.48. Show that every square echelon matrix is upper triangular, but not vice versa. 


3.49. Show that row equivalence is an equivalence relation: 
(i) A is row equivalent to A; 
(ii) A row equivalent to B implies B row equivalent to A; 


(iii) A row equivalent to B and B row equivalent to C implies A row equivalent to C. 


SQUARE MATRICES 


3.50. Let \( A = \begin{pmatrix} 2 & 2 \\ 3 & -1 \end{pmatrix} \). (i) Find A^2 and A^3. (ii) If f(x) = x^3 - 3x^2 - 2x + 4, find f(A). (iii) If
g(x) = x^2 - x - 8, find g(A).


3.51. Let \( B = \begin{pmatrix} 1 & 3 \\ 5 & 3 \end{pmatrix} \). (i) If f(x) = 2x^2 - 4x + 3, find f(B). (ii) If g(x) = x^2 - 4x - 12, find g(B).
(iii) Find a nonzero column vector \( u = \begin{pmatrix} x \\ y \end{pmatrix} \) such that Bu = 6u.


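3.52. Matrices A and B are said to commute if AB = BA. Find all matrices \( \begin{pmatrix} x & y \\ z & w \end{pmatrix} \) which commute
with \( \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \).


3.53. Let \( A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \). Find A^n.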


3.54. Let
\[
A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 7 & 0 \\ 0 & 11 \end{pmatrix}.
\]
Find: (i) A + B, (ii) AB, (iii) A^2 and A^3, (iv) A^n, (v) f(A) for a polynomial f(x).


3.55. Let
\[
D = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix},
\quad
A = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ b_1 & b_2 & \cdots & b_n \end{pmatrix},
\quad
B = \begin{pmatrix} c_1 & d_1 \\ c_2 & d_2 \\ \vdots & \vdots \\ c_n & d_n \end{pmatrix}.
\]
Find DA and BD.


3.56. Suppose the 2-square matrix B commutes with every 2-square matrix A, i.e. AB = BA. Show that
\[
B = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}
\]
for some scalar k, i.e. B is a scalar matrix.


3.57. Let D_k be the m-square scalar matrix with diagonal elements k. Show that:
(i) for any m × n matrix A, D_kA = kA; (ii) for any n × m matrix B, BD_k = kB.


3.58. Show that the sum, product and scalar multiple of:
(i) upper triangular matrices is upper triangular;
(ii) lower triangular matrices is lower triangular;
(iii) diagonal matrices is diagonal;
(iv) scalar matrices is scalar.


INVERTIBLE MATRICES 


3.59. 


3.60. 


3.61. 


3.62. 


3.65. 


Bk 2—- 
Find the inverse of each matrix: (i) ( ) Smeal) ( 5) 3 


a ae 
=O 8 Py Al ==] 
Find the inverse of each matrix: (i) Pag ale Ie. KCU)) CAMO warAen at 
A bs By) 8 
lL gs aA 
Find the inverse of 3 Sil  € 
aks Gi a 


Show that the operations of inverse and transpose commute; that is, (A^t)^{-1} = (A^{-1})^t. Thus, in
particular, A is invertible if and only if A^t is invertible.


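When is a diagonal matrix
\[
A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}
\]
invertible, and what is its inverse?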


Show that A is row equivalent to B if and only if there exists an invertible matrix P such that 
B=PA. 


Show that A is invertible if and only if the system AX =0 has only the zero solution. 


MISCELLANEOUS PROBLEMS 


3.66. Prove Theorem 3.2: (iii) (B + C)A = BA + CA; (iv) k(AB) = (kA)B = A(kB), where k is a scalar.
(Parts (i) and (ii) were proven in Problems 3.11 and 3.12.)


3.67. Prove Theorem 3.3: (i) (A + B)^t = A^t + B^t; (ii) (A^t)^t = A; (iii) (kA)^t = kA^t, for k a scalar.
(Part (iv) was proven in Problem 3.16.)


3.68. Suppose A = (A_ik) and B = (B_kj) are block matrices for which AB is defined and the number of
columns of each block A_ik is equal to the number of rows of each block B_kj. Show that AB = (C_ij)
where \( C_{ij} = \sum_k A_{ik}B_{kj} \).


3.69. The following operations are called elementary column operations:

[E1]: Interchange the ith column and the jth column.

[E2]: Multiply the ith column by a nonzero scalar k.

[E3]: Replace the ith column by k times the jth column plus the ith column.

Show that each of the operations has an inverse operation of the same type.


3.70. A matrix A is said to be equivalent to a matrix B if B can be obtained from A by a finite sequence
of operations, each being an elementary row or column operation. Show that matrix equivalence is
an equivalence relation.


3.71. Show that two consistent systems of linear equations have the same solution set if and only if their
augmented matrices are row equivalent. (We assume that zero rows are added so that both
augmented matrices have the same number of rows.)


Answers to Supplementary Problems 


« ( Jiri a Gienot denned tae in) ( eee 


316} BB) 3) 
lee hey 


(i) Not defined. 


We Leia tee! Aen 
(GRE lia (ii) Not defined. (iii) (9,9) (iv) W) =O) 3 By aE (ial) i eS 
2 4 S833.) UWA se aly On Smo 


3.42. (i) (a_1, a_2, a_3, a_4) (ii) (b_1, b_2, b_3, b_4) (iii) (c_1, c_2, c_3, c_4)


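3.44. (i)
\[
\begin{pmatrix} 1 & 2 & -1 & 2 & 1 \\ 0 & 0 & 3 & -6 & 1 \\ 0 & 0 & 0 & -6 & 1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 2 & 0 & 0 & 4/3 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1/6 \end{pmatrix}
\]
(ii)
\[
\begin{pmatrix} 2 & 3 & -2 & 5 & 1 \\ 0 & -11 & 10 & -15 & 5 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & 4/11 & 5/11 & 13/11 \\ 0 & 1 & -10/11 & 15/11 & -5/11 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}
\]

3.45. (i)
\[
\begin{pmatrix} 1 & 3 & -1 & 2 \\ 0 & 11 & -5 & 3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & 4/11 & 13/11 \\ 0 & 1 & -5/11 & 3/11 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
(ii)
\[
\begin{pmatrix} 0 & 1 & 3 & -2 \\ 0 & 0 & -13 & 11 \\ 0 & 0 & 0 & 35 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]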


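3.46. \( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \), \( \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \), \( \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \) or \( \begin{pmatrix} 1 & k \\ 0 & 0 \end{pmatrix} \) where k is any scalar.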


Opal el! 
3.48. (Q) al : is upper triangular but not an echelon matrix. 
001 


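3.50. (i) \( A^2 = \begin{pmatrix} 10 & 2 \\ 3 & 7 \end{pmatrix} \), \( A^3 = \begin{pmatrix} 26 & 18 \\ 27 & -1 \end{pmatrix} \) (ii) \( f(A) = \begin{pmatrix} -4 & 8 \\ 12 & -16 \end{pmatrix} \) (iii) \( g(A) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \)

3.51. (i) \( f(B) = \begin{pmatrix} 31 & 12 \\ 20 & 39 \end{pmatrix} \) (ii) \( g(B) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \) (iii) \( u = \begin{pmatrix} 3 \\ 5 \end{pmatrix} \) or \( \begin{pmatrix} 3k \\ 5k \end{pmatrix} \), k ≠ 0.


3.52. Only matrices of the form \( \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \) commute with \( \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \).


3.53. \( A^n = \begin{pmatrix} 1 & 2n \\ 0 & 1 \end{pmatrix} \)


3.54. (i) \( A + B = \begin{pmatrix} 9 & 0 \\ 0 & 14 \end{pmatrix} \) (ii) \( AB = \begin{pmatrix} 14 & 0 \\ 0 & 33 \end{pmatrix} \) (iii) \( A^2 = \begin{pmatrix} 4 & 0 \\ 0 & 9 \end{pmatrix} \), \( A^3 = \begin{pmatrix} 8 & 0 \\ 0 & 27 \end{pmatrix} \)
(iv) \( A^n = \begin{pmatrix} 2^n & 0 \\ 0 & 3^n \end{pmatrix} \) (v) \( f(A) = \begin{pmatrix} f(2) & 0 \\ 0 & f(3) \end{pmatrix} \)

3.55. (i) \( DA = \begin{pmatrix} 3a_1 & 3a_2 & \cdots & 3a_n \\ 3b_1 & 3b_2 & \cdots & 3b_n \end{pmatrix} = 3A \) (ii) \( BD = \begin{pmatrix} 3c_1 & 3d_1 \\ 3c_2 & 3d_2 \\ \vdots & \vdots \\ 3c_n & 3d_n \end{pmatrix} = 3B \)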


aes We 
3.59. (i) a .) mw et ie 


—5 4 —3 SS ail =e 
3.60 Ch ai) 4 (Gh) fs) he ee 
Se Ome IQ) Sal se! 


BY, ANT —alal 
3.61. 9/20 2 
=I 4 5 


3.62. Given AA^{-1} = I. Then I = I^t = (AA^{-1})^t = (A^{-1})^t A^t. That is, (A^{-1})^t = (A^t)^{-1}.


3.63. A is invertible iff each a_i ≠ 0. Then A^{-1} is the diagonal matrix with diagonal entries a_1^{-1}, a_2^{-1}, ..., a_n^{-1}.




Chapter 4 


Vector Spaces and Subspaces 


INTRODUCTION 


In Chapter 1 we studied the concrete structures R^n and C^n and derived various properties.
Now certain of these properties will play the role of axioms as we define abstract
“vector spaces” or, as they are sometimes called, “linear spaces”. In particular, the
conclusions (i) through (viii) of Theorem 1.1, page 3, become axioms [A1]-[A4], [M1]-[M4] below.
We will see that, in a certain sense, we get nothing new. In fact, we prove in Chapter 5
that every vector space over R which has “finite dimension” (defined there) can be identified
with R^n for some n.


The definition of a vector space involves an arbitrary field (see Appendix B) whose
elements are called scalars. We adopt the following notation (unless otherwise stated or
implied):

    K                the field of scalars,
    a, b, c or k     the elements of K,
    V                the given vector space,
    u, v, w          the elements of V.

We remark that nothing essential is lost if the reader assumes that K is the real field R
or the complex field C.


Lastly, we mention that the “dot product”, and related notions such as orthogonality,
are not considered part of the fundamental vector space structure, but an additional
structure which may or may not be introduced. Such spaces shall be investigated in the
latter part of the text.


Definition: Let K be a given field and let V be a nonempty set with rules of addition and
scalar multiplication which assign to any u, v ∈ V a sum u + v ∈ V and to
any u ∈ V, k ∈ K a product ku ∈ V. Then V is called a vector space over K
(and the elements of V are called vectors) if the following axioms hold:

[A1]: For any vectors u, v, w ∈ V, (u + v) + w = u + (v + w).

[A2]: There is a vector in V, denoted by 0 and called the zero vector, for which u + 0 = u
for any vector u ∈ V.

[A3]: For each vector u ∈ V there is a vector in V, denoted by -u, for which u + (-u) = 0.

[A4]: For any vectors u, v ∈ V, u + v = v + u.

[M1]: For any scalar k ∈ K and any vectors u, v ∈ V, k(u + v) = ku + kv.

[M2]: For any scalars a, b ∈ K and any vector u ∈ V, (a + b)u = au + bu.

[M3]: For any scalars a, b ∈ K and any vector u ∈ V, (ab)u = a(bu).

[M4]: For the unit scalar 1 ∈ K, 1u = u for any vector u ∈ V.




The above axioms naturally split into two sets. The first four are only concerned with
the additive structure of V and can be summarized by saying that V is a commutative group
(see Appendix B) under addition. It follows that any sum of vectors of the form

        v_1 + v_2 + ... + v_m

requires no parentheses and does not depend upon the order of the summands, the zero
vector 0 is unique, the negative -u of u is unique, and the cancellation law holds:

        u + w = v + w   implies   u = v

for any vectors u, v, w ∈ V. Also, subtraction is defined by

        u - v = u + (-v)

On the other hand, the remaining four axioms are concerned with the “action” of the
field K on V. Observe that the labelling of the axioms reflects this splitting. Using these
additional axioms we prove (Problem 4.1) the following simple properties of a vector space.


Theorem 4.1: Let V be a vector space over a field K.
(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
(iv) For any scalar k ∈ K and any vector u ∈ V, (-k)u = k(-u) = -ku.


EXAMPLES OF VECTOR SPACES 


We now list a number of important examples of vector spaces. The first example is a
generalization of the space R^n.


Example 4.1: Let K be an arbitrary field. The set of all n-tuples of elements of K with vector
addition and scalar multiplication defined by

        (a_1, a_2, ..., a_n) + (b_1, b_2, ..., b_n) = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n)
and     k(a_1, a_2, ..., a_n) = (ka_1, ka_2, ..., ka_n)

where a_i, b_i, k ∈ K, is a vector space over K; we denote this space by K^n. The zero
vector in K^n is the n-tuple of zeros, 0 = (0, 0, ..., 0). The proof that K^n is a vector
space is identical to the proof of Theorem 1.1, which we may now regard as stating
that R^n with the operations defined there is a vector space over R.
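
The componentwise operations of Example 4.1 can be tried out in a short Python sketch. It takes K to be the rational field (via the standard fractions module) and spot-checks the eight axioms on a few sample vectors and scalars; the names add, scale and axioms_hold and the sample data are assumptions made for illustration, and a finite check of this kind is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def add(u, v):
    """Componentwise sum of two n-tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    """Scalar multiple of an n-tuple."""
    return tuple(k * a for a in u)

def axioms_hold(vectors, scalars):
    """Spot-check [A1]-[A4] and [M1]-[M4] on the given samples (not a proof)."""
    zero = tuple(Fraction(0) for _ in vectors[0])
    for u, v, w in product(vectors, repeat=3):
        if add(add(u, v), w) != add(u, add(v, w)):                    # [A1]
            return False
    for u, v in product(vectors, repeat=2):
        if add(u, v) != add(v, u):                                    # [A4]
            return False
    for u in vectors:
        if add(u, zero) != u:                                         # [A2]
            return False
        if add(u, scale(Fraction(-1), u)) != zero:                    # [A3], taking -u = (-1)u
            return False
        if scale(Fraction(1), u) != u:                                # [M4]
            return False
    for a, b in product(scalars, repeat=2):
        for u in vectors:
            if scale(a + b, u) != add(scale(a, u), scale(b, u)):      # [M2]
                return False
            if scale(a * b, u) != scale(a, scale(b, u)):              # [M3]
                return False
    for k in scalars:
        for u, v in product(vectors, repeat=2):
            if scale(k, add(u, v)) != add(scale(k, u), scale(k, v)):  # [M1]
                return False
    return True

# Sample vectors in Q^3 and sample scalars in Q, chosen arbitrarily.
vectors = [tuple(map(Fraction, t)) for t in [(1, 2, 3), (0, -1, 4), (5, 0, -2)]]
scalars = [Fraction(2), Fraction(-3, 4), Fraction(0)]
ok = axioms_hold(vectors, scalars)
```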


Example 4.2: Let V be the set of all m X n matrices with entries from an arbitrary field K. Then 
V is a vector space over K with respect to the operations of matrix addition and 
scalar multiplication, by Theorem 3.1. 


Example 4.3: Let V be the set of all polynomials a_0 + a_1t + a_2t^2 + ... + a_nt^n with coefficients a_i
from a field K. Then V is a vector space over K with respect to the usual operations
of addition of polynomials and multiplication by a constant.


Example 4.4: Let K be an arbitrary field and let X be any nonempty set. Consider the set V of all
functions from X into K. The sum of any two functions f, g ∈ V is the function
f + g ∈ V defined by

        (f + g)(x) = f(x) + g(x)

and the product of a scalar k ∈ K and a function f ∈ V is the function kf ∈ V
defined by

        (kf)(x) = k f(x)

Then V with the above operations is a vector space over K (Problem 4.5). The zero
vector in V is the zero function 0 which maps each x ∈ X into 0 ∈ K: 0(x) = 0
for every x ∈ X. Furthermore, for any function f ∈ V, -f is that function in V
for which (-f)(x) = -f(x), for every x ∈ X.


Example 4.5: Suppose E is a field which contains a subfield K. Then E can be considered to be a
vector space over K, taking the usual addition in E to be the vector addition and
defining the scalar product kv of k ∈ K and v ∈ E to be the product of k and v
as elements of the field E. Thus the complex field C is a vector space over the real
field R, and the real field R is a vector space over the rational field Q.


SUBSPACES


Let W be a subset of a vector space V over a field K. W is called a subspace of V if W is
itself a vector space over K with respect to the operations of vector addition and scalar
multiplication on V. Simple criteria for identifying subspaces follow.


Theorem 4.2: W is a subspace of V if and only if
(i) W is nonempty,
(ii) W is closed under vector addition: v, w ∈ W implies v + w ∈ W,
(iii) W is closed under scalar multiplication: v ∈ W implies kv ∈ W for
every k ∈ K.


Corollary 4.3: W is a subspace of V if and only if (i) 0 ∈ W (or W ≠ ∅), and (ii) v, w ∈ W
implies av + bw ∈ W for every a, b ∈ K.


Example 4.6: Let V be any vector space. Then the set {0} consisting of the zero vector alone, and
also the entire space V are subspaces of V.


Example 4.7: (i) Let V be the vector space R^3. Then the set W consisting of those vectors whose
third component is zero, W = {(a, b, 0): a, b ∈ R}, is a subspace of V.

(ii) Let V be the space of all square n × n matrices (see Example 4.2). Then the
set W consisting of those matrices A = (a_ij) for which a_ij = a_ji, called
symmetric matrices, is a subspace of V.

(iii) Let V be the space of polynomials (see Example 4.3). Then the set W consisting
of polynomials with degree ≤ n, for a fixed n, is a subspace of V.

(iv) Let V be the space of all functions from a nonempty set X into the real field R.
Then the set W consisting of all bounded functions in V is a subspace of V.
(A function f ∈ V is bounded if there exists M ∈ R such that |f(x)| ≤ M for
every x ∈ X.)


Example 4.8: Consider any homogeneous system of linear equations in n unknowns with, say, real
coefficients:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0
\end{aligned}
\]

Recall that any particular solution of the system may be viewed as a point in R^n.
The set W of all solutions of the homogeneous system is a subspace of R^n (Problem
4.16) called the solution space. We comment that the solution set of a nonhomogeneous
system of linear equations in n unknowns is not a subspace of R^n.
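
The contrast between homogeneous and nonhomogeneous systems can be seen numerically. The following Python sketch uses a small coefficient matrix M and sample solutions, all hypothetical and chosen only for illustration: a linear combination of two solutions of MX = 0 is again a solution, whereas the sum of two solutions of a nonhomogeneous system MX = c is not, and the zero vector does not solve it.

```python
from fractions import Fraction

def apply(matrix, x):
    """Left-hand sides of the system: one dot product per equation."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

def combo(a, v, b, w):
    """Linear combination a*v + b*w of two n-tuples."""
    return [a * vi + b * wi for vi, wi in zip(v, w)]

# Hypothetical coefficient matrix of a system in 3 unknowns.
M = [[1, 1, -1],
     [2, 2, -2]]

# Two solutions of the homogeneous system M x = 0 and a combination of them.
v = [1, 0, 1]
w = [0, 1, 1]
z = combo(Fraction(3), v, Fraction(-5, 2), w)
homogeneous_closed = (apply(M, v) == [0, 0] and apply(M, w) == [0, 0]
                      and apply(M, z) == [0, 0])

# Two solutions p, q of the nonhomogeneous system M x = rhs.
rhs = [1, 2]
p = [1, 0, 0]
q = [0, 1, 0]
sum_is_solution = apply(M, combo(1, p, 1, q)) == rhs   # the sum p + q
zero_is_solution = apply(M, [0, 0, 0]) == rhs          # the zero vector
```

Here homogeneous_closed is true, while sum_is_solution and zero_is_solution are both false.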




Example 4.9: Let U and W be subspaces of a vector space V. We show that the intersection
U ∩ W is also a subspace of V. Clearly 0 ∈ U and 0 ∈ W since U and W are
subspaces; whence 0 ∈ U ∩ W. Now suppose u, v ∈ U ∩ W. Then u, v ∈ U and u, v ∈ W
and, since U and W are subspaces,

        au + bv ∈ U   and   au + bv ∈ W

for any scalars a, b ∈ K. Accordingly, au + bv ∈ U ∩ W and so U ∩ W is a
subspace of V.


The result in the preceding example generalizes as follows. 


Theorem 4.4: The intersection of any number of subspaces of a vector space V is a 
subspace of V. 


LINEAR COMBINATIONS, LINEAR SPANS 


Let V be a vector space over a field K and let v_1, ..., v_m ∈ V. Any vector in V of the
form
        a_1v_1 + a_2v_2 + ... + a_mv_m
where the a_i ∈ K, is called a linear combination of v_1, ..., v_m. The following theorem
applies.


Theorem 4.5: Let S be a nonempty subset of V. The set of all linear combinations of
vectors in S, denoted by L(S), is a subspace of V containing S. Furthermore,
if W is any other subspace of V containing S, then L(S) ⊂ W.


In other words, L(S) is the smallest subspace of V containing S; hence it is called the
subspace spanned or generated by S. For convenience, we define L(∅) = {0}.


Example 4.10: Let V be the vector space R^3. The linear span of any nonzero vector u consists
of all scalar multiples of u; geometrically, it is the line through the origin and the
point u. The linear span of any two vectors u and v which are not multiples of
each other is the plane through the origin and the points u and v.


Example 4.11: The vectors e_1 = (1, 0, 0), e_2 = (0, 1, 0) and e_3 = (0, 0, 1) generate the vector space
R^3. For any vector (a, b, c) ∈ R^3 is a linear combination of the e_i; specifically,

        (a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1)
                  = ae_1 + be_2 + ce_3


Example 4.12: The polynomials 1, t, t^2, t^3, ... generate the vector space V of all polynomials
(in t): V = L(1, t, t^2, ...). For any polynomial is a linear combination of 1 and
powers of t.




Example 4.13: Determine whether or not the vector v = (3, 9, -4, -2) is a linear combination of
the vectors u_1 = (1, -2, 0, 3), u_2 = (2, 3, 0, -1) and u_3 = (2, -1, 2, 1), i.e. belongs
to the space spanned by the u_i.

Set v as a linear combination of the u_i, using unknowns x, y and z; that is, set
v = xu_1 + yu_2 + zu_3:

        (3, 9, -4, -2) = x(1, -2, 0, 3) + y(2, 3, 0, -1) + z(2, -1, 2, 1)
                       = (x + 2y + 2z, -2x + 3y - z, 2z, 3x - y + z)

Form the equivalent system of equations by setting corresponding components equal
to each other, and then reduce to echelon form:

\[
\begin{array}{rcr}
x + 2y + 2z &=& 3 \\
-2x + 3y - z &=& 9 \\
2z &=& -4 \\
3x - y + z &=& -2
\end{array}
\quad\text{or}\quad
\begin{array}{rcr}
x + 2y + 2z &=& 3 \\
7y + 3z &=& 15 \\
2z &=& -4 \\
-7y - 5z &=& -11
\end{array}
\]
\[
\text{or}\quad
\begin{array}{rcr}
x + 2y + 2z &=& 3 \\
7y + 3z &=& 15 \\
2z &=& -4 \\
-2z &=& 4
\end{array}
\quad\text{or}\quad
\begin{array}{rcr}
x + 2y + 2z &=& 3 \\
7y + 3z &=& 15 \\
2z &=& -4
\end{array}
\]

Note that the above system is consistent and so has a solution; hence v is a linear
combination of the u_i. Solving for the unknowns we obtain x = 1, y = 3, z = -2.
Thus v = u_1 + 3u_2 - 2u_3.

Note that if the system of linear equations were not consistent, i.e. had no solution,
then the vector v would not be a linear combination of the u_i.
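
The method of Example 4.13 (set up the system componentwise, reduce, and read off the unknowns) can be sketched in Python as follows. The function name solve_combination is hypothetical, exact rational arithmetic is assumed, and when unknowns are free the sketch simply sets them to 0 and returns one solution; it returns None when the system is inconsistent.

```python
from fractions import Fraction

def solve_combination(vectors, target):
    """Return coefficients c with sum(c[j] * vectors[j]) == target, or None."""
    rows, cols = len(target), len(vectors)
    # Augmented matrix: column j holds vectors[j]; the last column holds target.
    m = [[Fraction(vectors[j][i]) for j in range(cols)] + [Fraction(target[i])]
         for i in range(rows)]
    pivots = []
    r = 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column: a free unknown
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    # A row reading 0 = nonzero means the system is inconsistent.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in m):
        return None
    coeffs = [Fraction(0)] * cols
    for i, c in enumerate(pivots):
        coeffs[c] = m[i][-1]
    return coeffs

u1, u2, u3 = (1, -2, 0, 3), (2, 3, 0, -1), (2, -1, 2, 1)
v = (3, 9, -4, -2)
xyz = solve_combination([u1, u2, u3], v)
no_solution = solve_combination([(1, 0), (2, 0)], (0, 1))
```

For the vectors of the example, xyz holds the values 1, 3, -2, in agreement with v = u_1 + 3u_2 - 2u_3; the second call illustrates the inconsistent case.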


ROW SPACE OF A MATRIX 
Let A be an arbitrary m × n matrix over a field K:

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\cdots & \cdots & \cdots & \cdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\]

The rows of A,

        R_1 = (a_11, a_12, ..., a_1n), ..., R_m = (a_m1, a_m2, ..., a_mn)

viewed as vectors in K^n, span a subspace of K^n called the row space of A. That is,

        row space of A = L(R_1, R_2, ..., R_m)

Analogously, the columns of A, viewed as vectors in K^m, span a subspace of K^m called the
column space of A.

Now suppose we apply an elementary row operation on A,

        (i) R_i ↔ R_j,   (ii) R_i → kR_i, k ≠ 0,   or   (iii) R_i → kR_j + R_i

and obtain a matrix B. Then each row of B is clearly a row of A or a linear combination of
rows of A. Hence the row space of B is contained in the row space of A. On the other
hand, we can apply the inverse elementary row operation on B and obtain A; hence the row
space of A is contained in the row space of B. Accordingly, A and B have the same row
space. This leads us to the following theorem.


Theorem 4.6: Row equivalent matrices have the same row space. 


We shall prove (Problem 4.31), in particular, the following fundamental result con- 
cerning row reduced echelon matrices. 




Theorem 4.7: Row reduced echelon matrices have the same row space if and only if they 
have the same nonzero rows. 
Thus every matrix is row equivalent to a unique row reduced echelon matrix called its 
row canonical form. 
We apply the above results in the next example. 


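Example 4.14: Show that the space U generated by the vectors

        u_1 = (1, 2, -1, 3),   u_2 = (2, 4, 1, -2)   and   u_3 = (3, 6, 3, -7)

and the space V generated by the vectors

        v_1 = (1, 2, -4, 11)   and   v_2 = (2, 4, -5, 14)

are equal; that is, U = V.

Method 1. Show that each u_i is a linear combination of v_1 and v_2, and show that
each v_i is a linear combination of u_1, u_2 and u_3. Observe that we have to show that
five systems of linear equations (one for each of the five vectors) are consistent.

Method 2. Form the matrix A whose rows are the u_i, and row reduce A to row
canonical form:

\[
A = \begin{pmatrix} 1 & 2 & -1 & 3 \\ 2 & 4 & 1 & -2 \\ 3 & 6 & 3 & -7 \end{pmatrix}
\text{ to }
\begin{pmatrix} 1 & 2 & -1 & 3 \\ 0 & 0 & 3 & -8 \\ 0 & 0 & 6 & -16 \end{pmatrix}
\text{ to }
\begin{pmatrix} 1 & 2 & -1 & 3 \\ 0 & 0 & 3 & -8 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\text{ to }
\begin{pmatrix} 1 & 2 & 0 & 1/3 \\ 0 & 0 & 1 & -8/3 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]

Now form the matrix B whose rows are v_1 and v_2, and row reduce B to row canonical
form:

\[
B = \begin{pmatrix} 1 & 2 & -4 & 11 \\ 2 & 4 & -5 & 14 \end{pmatrix}
\text{ to }
\begin{pmatrix} 1 & 2 & -4 & 11 \\ 0 & 0 & 3 & -8 \end{pmatrix}
\text{ to }
\begin{pmatrix} 1 & 2 & 0 & 1/3 \\ 0 & 0 & 1 & -8/3 \end{pmatrix}
\]

Since the nonzero rows of the reduced matrices are identical, the row spaces of A
and B are equal and so U = V.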
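
Method 2 lends itself to a short Python sketch: reduce both matrices to row canonical form and compare the nonzero rows, as Theorems 4.6 and 4.7 permit. The function names row_canonical_form and same_span are assumptions made for this illustration, and exact rational arithmetic is used.

```python
from fractions import Fraction

def row_canonical_form(rows):
    """Nonzero rows of the row canonical form, as tuples of Fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]   # leading entry becomes 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return [tuple(row) for row in m if any(x != 0 for x in row)]

def same_span(vectors_a, vectors_b):
    """True if the two lists of vectors generate the same subspace."""
    return row_canonical_form(vectors_a) == row_canonical_form(vectors_b)

U = [(1, 2, -1, 3), (2, 4, 1, -2), (3, 6, 3, -7)]
V = [(1, 2, -4, 11), (2, 4, -5, 14)]
equal_spaces = same_span(U, V)
```

With the vectors of Example 4.14, both reductions yield the rows (1, 2, 0, 1/3) and (0, 0, 1, -8/3), so equal_spaces is true.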


SUMS AND DIRECT SUMS 


Let U and W be subspaces of a vector space V. The sum of U and W, written U + W,
consists of all sums u + w where u ∈ U and w ∈ W:

        U + W = {u + w : u ∈ U, w ∈ W}

Note that 0 = 0 + 0 ∈ U + W, since 0 ∈ U, 0 ∈ W. Furthermore, suppose u + w and
u' + w' belong to U + W, with u, u' ∈ U and w, w' ∈ W. Then

        (u + w) + (u' + w') = (u + u') + (w + w') ∈ U + W

and, for any scalar k,

        k(u + w) = ku + kw ∈ U + W

Thus we have proven the following theorem.


Theorem 4.8: The sum U+W of the subspaces U and W of V is also a subspace of V. 


Example 4.15: Let V be the vector space of 2 by 2 matrices over R. Let U consist of those
matrices in V whose second row is zero, and let W consist of those matrices in V
whose second column is zero:

\[
U = \left\{ \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix} : a, b \in R \right\},
\qquad
W = \left\{ \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix} : a, c \in R \right\}
\]

Now U and W are subspaces of V. We have:

\[
U + W = \left\{ \begin{pmatrix} a & b \\ c & 0 \end{pmatrix} : a, b, c \in R \right\}
\quad\text{and}\quad
U \cap W = \left\{ \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} : a \in R \right\}
\]

That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W
consists of those matrices whose second row and second column are zero.


Definition: The vector space V is said to be the direct sum of its subspaces U and W,
denoted by
        V = U ⊕ W
if every vector v ∈ V can be written in one and only one way as v = u + w
where u ∈ U and w ∈ W.

The following theorem applies.


Theorem 4.9: The vector space V is the direct sum of its subspaces U and W if and only
if: (i) V = U + W, and (ii) U ∩ W = {0}.


Example 4.16: In the vector space R^3, let U be the xy plane and let W be the yz plane:

        U = {(a, b, 0): a, b ∈ R}   and   W = {(0, b, c): b, c ∈ R}

Then R^3 = U + W since every vector in R^3 is the sum of a vector in U and a vector
in W. However, R^3 is not the direct sum of U and W since such sums are not
unique; for example,

        (3, 5, 7) = (3, 1, 0) + (0, 4, 7)   and also   (3, 5, 7) = (3, -4, 0) + (0, 9, 7)


Example 4.17: In R^3, let U be the xy plane and let W be the z axis:

        U = {(a, b, 0): a, b ∈ R}   and   W = {(0, 0, c): c ∈ R}

Now any vector (a, b, c) ∈ R^3 can be written as the sum of a vector in U and a
vector in W in one and only one way:

        (a, b, c) = (a, b, 0) + (0, 0, c)

Accordingly, R^3 is the direct sum of U and W, that is, R^3 = U ⊕ W.
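
The difference between Examples 4.16 and 4.17 can be made concrete with a small Python sketch. The function names and the parameter t are hypothetical: for the xy plane and the z axis the decomposition is forced, while for the xy plane and the yz plane the second component may be split in infinitely many ways, one for each value of t.

```python
def add(u, w):
    """Componentwise sum of two triples."""
    return tuple(x + y for x, y in zip(u, w))

def split_xy_plane_and_z_axis(v):
    """The only decomposition v = u + w with u in the xy plane, w on the z axis."""
    a, b, c = v
    return (a, b, 0), (0, 0, c)

def split_xy_and_yz_planes(v, t):
    """One decomposition v = u + w with u in the xy plane and w in the yz plane.

    t is the (freely chosen) second component of u, so the splitting is not unique.
    """
    a, b, c = v
    return (a, t, 0), (0, b - t, c)

v = (3, 5, 7)
first = split_xy_and_yz_planes(v, 1)     # ((3, 1, 0), (0, 4, 7))
second = split_xy_and_yz_planes(v, -4)   # ((3, -4, 0), (0, 9, 7))
unique = split_xy_plane_and_z_axis(v)    # ((3, 5, 0), (0, 0, 7))
```

The values t = 1 and t = -4 reproduce the two decompositions of (3, 5, 7) given in Example 4.16.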


Solved Problems 


VECTOR SPACES 
4.1. Prove Theorem 4.1: Let V be a vector space over a field K.

(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
(iv) For any k ∈ K and any u ∈ V, (-k)u = k(-u) = -ku.

(i) By axiom [A2] with u = 0, we have 0 + 0 = 0. Hence by axiom [M1], k0 = k(0 + 0) =
k0 + k0. Adding -k0 to both sides gives the desired result.

(ii) By a property of K, 0 + 0 = 0. Hence by axiom [M2], 0u = (0 + 0)u = 0u + 0u. Adding -0u
to both sides yields the required result.


(iii) Suppose ku = 0 and k ≠ 0. Then there exists a scalar k^{-1} such that k^{-1}k = 1; hence

        u = 1u = (k^{-1}k)u = k^{-1}(ku) = k^{-1}0 = 0

(iv) Using u + (-u) = 0, we obtain 0 = k0 = k(u + (-u)) = ku + k(-u). Adding -ku to both
sides gives -ku = k(-u).

Using k + (-k) = 0, we obtain 0 = 0u = (k + (-k))u = ku + (-k)u. Adding -ku to both
sides yields -ku = (-k)u. Thus (-k)u = k(-u) = -ku.


4.2. Show that for any scalar k and any vectors u and v, k(u - v) = ku - kv.

Using the definition of subtraction (u - v = u + (-v)) and the result of Theorem 4.1(iv)
(k(-v) = -kv),

        k(u - v) = k(u + (-v)) = ku + k(-v) = ku + (-kv) = ku - kv


4.3. In the statement of axiom [M2], (a + b)u = au + bu, which operation does each plus
sign represent?

The + in (a + b)u denotes the addition of the two scalars a and b; hence it represents the
addition operation in the field K. On the other hand, the + in au + bu denotes the addition of the two
vectors au and bu; hence it represents the operation of vector addition. Thus each + represents a
different operation.


4.4. In the statement of axiom [M3], (ab)u = a(bu), which operation does each product
represent?

In (ab)u the product ab of the scalars a and b denotes multiplication in the field K, whereas the
product of the scalar ab and the vector u denotes scalar multiplication.

In a(bu) the product bu of the scalar b and the vector u denotes scalar multiplication; also, the
product of the scalar a and the vector bu denotes scalar multiplication.


4.5. Let V be the set of all functions from a nonempty set X into a field K. For any
functions f, g ∈ V and any scalar k ∈ K, let f + g and kf be the functions in V defined
as follows:

        (f + g)(x) = f(x) + g(x)   and   (kf)(x) = k f(x),   ∀x ∈ X

(The symbol ∀ means “for every”.) Prove that V is a vector space over K.


Since X is nonempty, V is also nonempty. We now need to show that all the axioms of a vector
space hold.

[A1]: Let f, g, h ∈ V. To show that (f + g) + h = f + (g + h), it is necessary to show that
the function (f + g) + h and the function f + (g + h) both assign the same value to each
x ∈ X. Now,

        ((f + g) + h)(x) = (f + g)(x) + h(x) = (f(x) + g(x)) + h(x),   ∀x ∈ X
        (f + (g + h))(x) = f(x) + (g + h)(x) = f(x) + (g(x) + h(x)),   ∀x ∈ X

But f(x), g(x) and h(x) are scalars in the field K where addition of scalars is associative; hence

        (f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x))

Accordingly, (f + g) + h = f + (g + h).

[A2]: Let 0 denote the zero function: 0(x) = 0, ∀x ∈ X. Then for any function f ∈ V,

        (f + 0)(x) = f(x) + 0(x) = f(x) + 0 = f(x),   ∀x ∈ X

Thus f + 0 = f, and 0 is the zero vector in V.




[A3]: For any function f ∈ V, let -f be the function defined by (-f)(x) = -f(x). Then,

        (f + (-f))(x) = f(x) + (-f)(x) = f(x) - f(x) = 0 = 0(x),   ∀x ∈ X

Hence f + (-f) = 0.

[A4]: Let f, g ∈ V. Then

        (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x),   ∀x ∈ X

Hence f + g = g + f. (Note that f(x) + g(x) = g(x) + f(x) follows from the fact that f(x) and
g(x) are scalars in the field K where addition is commutative.)

[M1]: Let f, g ∈ V and k ∈ K. Then

        (k(f + g))(x) = k((f + g)(x)) = k(f(x) + g(x)) = kf(x) + kg(x)
                      = (kf)(x) + (kg)(x) = (kf + kg)(x),   ∀x ∈ X

Hence k(f + g) = kf + kg. (Note that k(f(x) + g(x)) = kf(x) + kg(x) follows from the fact that
k, f(x) and g(x) are scalars in the field K where multiplication is distributive over addition.)

[M2]: Let f ∈ V and a, b ∈ K. Then

        ((a + b)f)(x) = (a + b)f(x) = af(x) + bf(x) = (af)(x) + (bf)(x)
                      = (af + bf)(x),   ∀x ∈ X

Hence (a + b)f = af + bf.

[M3]: Let f ∈ V and a, b ∈ K. Then,

        ((ab)f)(x) = (ab)f(x) = a(bf(x)) = a(bf)(x) = (a(bf))(x),   ∀x ∈ X

Hence (ab)f = a(bf).

[M4]: Let f ∈ V. Then, for the unit 1 ∈ K, (1f)(x) = 1f(x) = f(x), ∀x ∈ X. Hence 1f = f.


Since all the axioms are satisfied, V is a vector space over K. 
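
The pointwise operations of Problem 4.5 translate directly into functions that return functions. The Python sketch below is an illustration only: the finite set X, the sample functions f and g, and the helper names are assumptions, K is taken to be the rational field, and equality of functions is checked value by value on X.

```python
from fractions import Fraction

def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(k, f):
    """Pointwise scalar multiple: (kf)(x) = k f(x)."""
    return lambda x: k * f(x)

def f_neg(f):
    """The negative of f: (-f)(x) = -f(x)."""
    return lambda x: -f(x)

def zero(x):
    """The zero function, which maps every x into 0."""
    return Fraction(0)

def equal_on(h1, h2, points):
    """Two functions are equal if they agree at every point of X."""
    return all(h1(x) == h2(x) for x in points)

# A hypothetical three-element set X and two functions from X into Q.
X = ["p", "q", "r"]
f = {"p": Fraction(1), "q": Fraction(-2), "r": Fraction(1, 2)}.get
g = {"p": Fraction(3), "q": Fraction(0), "r": Fraction(5)}.get

k = Fraction(2)
commutative = equal_on(f_add(f, g), f_add(g, f), X)                        # [A4]
has_negative = equal_on(f_add(f, f_neg(f)), zero, X)                       # [A3]
distributive = equal_on(f_scale(k, f_add(f, g)),
                        f_add(f_scale(k, f), f_scale(k, g)), X)            # [M1]
```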


4.6. Let V be the set of ordered pairs of real numbers: V = {(a, b): a, b ∈ R}. Show
that V is not a vector space over R with respect to each of the following operations
of addition in V and scalar multiplication on V:

(i) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, b);
(ii) (a, b) + (c, d) = (a, b) and k(a, b) = (ka, kb);
(iii) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (k^2 a, k^2 b).

In each case show that one of the axioms of a vector space does not hold.

(i) Let r = 1, s = 2, v = (3, 4). Then

        (r + s)v = 3(3, 4) = (9, 4)
        rv + sv = 1(3, 4) + 2(3, 4) = (3, 4) + (6, 4) = (9, 8)

Since (r + s)v ≠ rv + sv, axiom [M2] does not hold.

(ii) Let v = (1, 2), w = (3, 4). Then

        v + w = (1, 2) + (3, 4) = (1, 2)
        w + v = (3, 4) + (1, 2) = (3, 4)

Since v + w ≠ w + v, axiom [A4] does not hold.

(iii) Let r = 1, s = 2, v = (3, 4). Then

        (r + s)v = 3(3, 4) = (27, 36)
        rv + sv = 1(3, 4) + 2(3, 4) = (3, 4) + (12, 16) = (15, 20)

Thus (r + s)v ≠ rv + sv, and so axiom [M2] does not hold.

SUBSPACES 
4.7. Prove Theorem 4.2: W is a subspace of V if and only if (i) W is nonempty, (ii) v, w ∈ W
implies v + w ∈ W, and (iii) v ∈ W implies kv ∈ W for every scalar k ∈ K.

Suppose W satisfies (i), (ii) and (iii). By (i), W is nonempty; and by (ii) and (iii), the operations
of vector addition and scalar multiplication are well defined for W. Moreover, the axioms [A1], [A4],
[M1], [M2], [M3] and [M4] hold in W since the vectors in W belong to V. Hence we need only show
that [A2] and [A3] also hold in W. By (i), W is nonempty, say u ∈ W. Then by (iii), 0u = 0 ∈ W
and v + 0 = v for every v ∈ W. Hence W satisfies [A2]. Lastly, if v ∈ W then (-1)v = -v ∈ W
and v + (-v) = 0; hence W satisfies [A3]. Thus W is a subspace of V.

Conversely, if W is a subspace of V then clearly (i), (ii) and (iii) hold.

4.8. Prove Corollary 4.3: W is a subspace of V if and only if (i) 0 ∈ W and (ii) v, w ∈ W
implies av + bw ∈ W for all scalars a, b ∈ K.

Suppose W satisfies (i) and (ii). Then, by (i), W is nonempty. Furthermore, if v, w ∈ W then,
by (ii), v + w = 1v + 1w ∈ W; and if v ∈ W and k ∈ K then, by (ii), kv = kv + 0v ∈ W. Thus
by Theorem 4.2, W is a subspace of V.

Conversely, if W is a subspace of V then clearly (i) and (ii) hold in W.

4.9. Let V = R^3. Show that W is a subspace of V where:
(i) W = {(a, b, 0): a, b ∈ R}, i.e. W is the xy plane consisting of those vectors whose
third component is 0;
(ii) W = {(a, b, c): a + b + c = 0}, i.e. W consists of those vectors each with the
property that the sum of its components is zero.

(i) 0 = (0, 0, 0) ∈ W since the third component of 0 is 0. For any vectors v = (a, b, 0), w =
(c, d, 0) in W, and any scalars (real numbers) k and k',

        kv + k'w = k(a, b, 0) + k'(c, d, 0)
                 = (ka, kb, 0) + (k'c, k'd, 0) = (ka + k'c, kb + k'd, 0)

Thus kv + k'w ∈ W, and so W is a subspace of V.

(ii) 0 = (0, 0, 0) ∈ W since 0 + 0 + 0 = 0. Suppose v = (a, b, c), w = (a', b', c') belong to W, i.e.
a + b + c = 0 and a' + b' + c' = 0. Then for any scalars k and k',

        kv + k'w = k(a, b, c) + k'(a', b', c')
                 = (ka, kb, kc) + (k'a', k'b', k'c')
                 = (ka + k'a', kb + k'b', kc + k'c')

and furthermore,

        (ka + k'a') + (kb + k'b') + (kc + k'c') = k(a + b + c) + k'(a' + b' + c')
                                                = k0 + k'0 = 0

Thus kv + k'w ∈ W, and so W is a subspace of V.
4.10. Let V = R^3. Show that W is not a subspace of V where:

(i) W = {(a, b, c): a ≥ 0}, i.e. W consists of those vectors whose first component is
nonnegative;

(ii) W = {(a, b, c): a^2 + b^2 + c^2 ≤ 1}, i.e. W consists of those vectors whose length does
not exceed 1;

(iii) W = {(a, b, c): a, b, c ∈ Q}, i.e. W consists of those vectors whose components are
rational numbers.

In each case, show that one of the properties of, say, Theorem 4.2 does not hold.

(i) v = (1, 2, 3) ∈ W and k = -5 ∈ R. But kv = -5(1, 2, 3) = (-5, -10, -15) does not belong to
W since -5 is negative. Hence W is not a subspace of V.

(ii) v = (1, 0, 0) ∈ W and w = (0, 1, 0) ∈ W. But v + w = (1, 0, 0) + (0, 1, 0) = (1, 1, 0) does not
belong to W since 1^2 + 1^2 + 0^2 = 2 > 1. Hence W is not a subspace of V.

(iii) v = (1, 2, 3) ∈ W and k = √2 ∈ R. But kv = √2 (1, 2, 3) = (√2, 2√2, 3√2) does not belong to
W since its components are not rational numbers. Hence W is not a subspace of V.


4.11. Let V be the vector space of all square n × n matrices over a field K. Show that W
is a subspace of V where:

(i) W consists of the symmetric matrices, i.e. all matrices A = (a_ij) for which
a_ji = a_ij;

(ii) W consists of all matrices which commute with a given matrix T; that is,
W = {A ∈ V: AT = TA}.

(i) 0 ∈ W since all entries of 0 are 0 and hence equal. Now suppose A = (a_ij) and B = (b_ij)
belong to W, i.e. a_ji = a_ij and b_ji = b_ij. For any scalars a, b ∈ K, aA + bB is the matrix
whose ij-entry is aa_ij + bb_ij. But aa_ji + bb_ji = aa_ij + bb_ij. Thus aA + bB is also symmetric,
and so W is a subspace of V.

(ii) 0 ∈ W since 0T = 0 = T0. Now suppose A, B ∈ W; that is, AT = TA and BT = TB. For
any scalars a, b ∈ K,

        (aA + bB)T = (aA)T + (bB)T = a(AT) + b(BT) = a(TA) + b(TB)
                   = T(aA) + T(bB) = T(aA + bB)

Thus aA + bB commutes with T, i.e. belongs to W; hence W is a subspace of V.


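4.12. Let V be the vector space of all 2 × 2 matrices over the real field R. Show that W
is not a subspace of V where:

(i) W consists of all matrices with zero determinant;
(ii) W consists of all matrices A for which A^2 = A.

(i) (Recall that \( \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \).) The matrices \( A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \) and \( B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \) belong
to W since det (A) = 0 and det (B) = 0. But \( A + B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \) does not belong to W since
det (A + B) = 1. Hence W is not a subspace of V.

(ii) The unit matrix \( I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \) belongs to W since

\[
I^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
\]

But \( 2I = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \) does not belong to W since

\[
(2I)^2 = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & 4 \end{pmatrix} \ne 2I
\]

Hence W is not a subspace of V.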
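
Both parts of Problem 4.12 come down to exhibiting a sum or scalar multiple that leaves W, which is easy to confirm by direct computation. The Python sketch below uses the matrices A = (1 0; 0 0), B = (0 0; 0 1) and the unit matrix I, written as nested tuples; the helper names det2, add2, scale2 and mul2 are assumptions made for this illustration.

```python
def det2(m):
    """Determinant ad - bc of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def add2(m, n):
    """Entrywise sum of two 2 x 2 matrices."""
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(m, n))

def scale2(k, m):
    """Scalar multiple of a 2 x 2 matrix."""
    return tuple(tuple(k * x for x in r) for r in m)

def mul2(m, n):
    """Matrix product of two 2 x 2 matrices."""
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

# Part (i): zero determinant is not preserved by addition.
A = ((1, 0), (0, 0))
B = ((0, 0), (0, 1))
dets = (det2(A), det2(B), det2(add2(A, B)))

# Part (ii): the property M^2 = M is not preserved by scalar multiplication.
I = ((1, 0), (0, 1))
two_I = scale2(2, I)
i_is_idempotent = mul2(I, I) == I
two_i_is_idempotent = mul2(two_I, two_I) == two_I
```

Here dets evaluates to (0, 0, 1), and I satisfies I^2 = I while 2I does not satisfy (2I)^2 = 2I.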


4.13. Let V be the vector space of all functions from the real field R into R. Show that W 
is a subspace of V where: 
(i) W= ff: f(8) = 9}, ie. W consists of those functions which map 3 into 0; 
(ii) W={f: f(7) =f())}, Le. W consists of those functions which assign the same 
value to 7 and 1; 
(iii) W consists of the odd functions, i.e. those functions f for which f(—%) = —f(#). 


74 


4.14. 


4.15. 


4.16. 


VECTOR SPACES AND SUBSPACES [CHAP. 4 


Here 0 denotes the zero function: $0(x) = 0$, for every $x \in R$.

(i) $0 \in W$ since $0(3) = 0$. Suppose $f, g \in W$, i.e. $f(3) = 0$ and $g(3) = 0$. Then for any real numbers a and b,
\[
(af + bg)(3) = af(3) + bg(3) = a0 + b0 = 0
\]
Hence $af + bg \in W$, and so W is a subspace of V.

(ii) $0 \in W$ since $0(7) = 0 = 0(1)$. Suppose $f, g \in W$, i.e. $f(7) = f(1)$ and $g(7) = g(1)$. Then, for any real numbers a and b,
\[
(af + bg)(7) = af(7) + bg(7) = af(1) + bg(1) = (af + bg)(1)
\]
Hence $af + bg \in W$, and so W is a subspace of V.

(iii) $0 \in W$ since $0(-x) = 0 = -0 = -0(x)$. Suppose $f, g \in W$, i.e. $f(-x) = -f(x)$ and $g(-x) = -g(x)$. Then for any real numbers a and b,
\[
(af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af(x) + bg(x)) = -(af + bg)(x)
\]
Hence $af + bg \in W$, and so W is a subspace of V.


4.14. Let V be the vector space of all functions from the real field R into R. Show that W is not a subspace of V where:

(i) $W = \{f : f(7) = 2 + f(1)\}$;
(ii) W consists of all nonnegative functions, i.e. all functions f for which $f(x) \ge 0$, $\forall x \in R$.

(i) Suppose $f, g \in W$, i.e. $f(7) = 2 + f(1)$ and $g(7) = 2 + g(1)$. Then
\[
(f + g)(7) = f(7) + g(7) = 2 + f(1) + 2 + g(1) = 4 + f(1) + g(1) = 4 + (f + g)(1) \ne 2 + (f + g)(1)
\]
Hence $f + g \notin W$, and so W is not a subspace of V.

(ii) Let $k = -2$ and let $f \in V$ be defined by $f(x) = x^2$. Then $f \in W$ since $f(x) = x^2 \ge 0$, $\forall x \in R$. But $(kf)(5) = kf(5) = (-2)(5^2) = -50 < 0$. Hence $kf \notin W$, and so W is not a subspace of V.
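The failure of closure in Problem 4.14 can be observed on concrete functions. The sketch below is only an illustration: the particular functions `f` and `g` and the helpers `in_W`, `add` and `scale` are assumptions chosen for this example and do not appear in the text.

```python
from fractions import Fraction

def f(x):
    # f(7) - f(1) = 7/3 - 1/3 = 2, so f(7) = 2 + f(1)
    return Fraction(x, 3)

def g(x):
    # g(7) = 2 and g(1) = 0, so g(7) = 2 + g(1)
    return 2 if x == 7 else 0

def in_W(h):
    """Membership test for W = {h : h(7) = 2 + h(1)}."""
    return h(7) == 2 + h(1)

def add(h1, h2):
    """Pointwise sum of two functions."""
    return lambda x: h1(x) + h2(x)

def scale(k, h):
    """Pointwise scalar multiple of a function."""
    return lambda x: k * h(x)

def square(x):
    # a nonnegative function, as in part (ii)
    return x * x

sum_in_W = in_W(add(f, g))            # part (i): the sum leaves W
scaled_value = scale(-2, square)(5)   # part (ii): a negative value appears
```

Both `f` and `g` pass `in_W`, yet their sum does not; and the multiple of `square` by the scalar -2 takes a negative value at 5, so it is no longer nonnegative.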


4.15. Let V be the vector space of polynomials $a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$ with real coefficients, i.e. $a_i \in R$. Determine whether or not W is a subspace of V where:

(i) W consists of all polynomials with integral coefficients;

(ii) W consists of all polynomials with degree $\le 3$;

(iii) W consists of all polynomials $b_0 + b_1 t^2 + b_2 t^4 + \cdots + b_n t^{2n}$, i.e. polynomials with only even powers of t.

(i) No, since scalar multiples of vectors in W do not always belong to W. For example, $v = 3 + 5t + 7t^2 \in W$ but $\tfrac{1}{2}v = \tfrac{3}{2} + \tfrac{5}{2}t + \tfrac{7}{2}t^2 \notin W$. (Observe that W is "closed" under vector addition, i.e. sums of elements in W belong to W.)

(ii) and (iii). Yes. For, in each case, W is nonempty, the sum of elements in W belongs to W, and the scalar multiples of any element in W belong to W.


4.16. Consider a homogeneous system of linear equations in n unknowns $x_1, \ldots, x_n$ over a field K:
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0
\end{aligned}
\]
Show that the solution set W is a subspace of the vector space $K^n$.

$0 = (0, 0, \ldots, 0) \in W$ since, clearly,
\[
a_{i1}0 + a_{i2}0 + \cdots + a_{in}0 = 0, \quad \text{for } i = 1, \ldots, m
\]
Suppose $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ belong to W, i.e. for $i = 1, \ldots, m$
\[
\begin{aligned}
a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n &= 0 \\
a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n &= 0
\end{aligned}
\]
Let a and b be scalars in K. Then
\[
au + bv = (au_1 + bv_1, au_2 + bv_2, \ldots, au_n + bv_n)
\]
and, for $i = 1, \ldots, m$,
\[
\begin{aligned}
a_{i1}(au_1 + bv_1) + a_{i2}(au_2 + bv_2) + \cdots + a_{in}(au_n + bv_n)
&= a(a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n) + b(a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n) \\
&= a0 + b0 = 0
\end{aligned}
\]
Hence $au + bv$ is a solution of the system, i.e. belongs to W. Accordingly, W is a subspace of $K^n$.


LINEAR COMBINATIONS
4.17. Write the vector $v = (1, -2, 5)$ as a linear combination of the vectors $e_1 = (1, 1, 1)$, $e_2 = (1, 2, 3)$ and $e_3 = (2, -1, 1)$.

We wish to express v as $v = xe_1 + ye_2 + ze_3$, with x, y and z as yet unknown scalars. Thus we require
\[
\begin{aligned}
(1, -2, 5) &= x(1, 1, 1) + y(1, 2, 3) + z(2, -1, 1) \\
&= (x, x, x) + (y, 2y, 3y) + (2z, -z, z) \\
&= (x + y + 2z,\ x + 2y - z,\ x + 3y + z)
\end{aligned}
\]
Form the equivalent system of equations by setting corresponding components equal to each other, and then reduce to echelon form:
\[
\begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 2y - z &= 4 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 5z &= 10 \end{aligned}
\]
Note that the above system is consistent and so has a solution. Solve for the unknowns to obtain $x = -6$, $y = 3$, $z = 2$. Hence $v = -6e_1 + 3e_2 + 2e_3$.


4.18. Write the vector $v = (2, -5, 3)$ in $R^3$ as a linear combination of the vectors $e_1 = (1, -3, 2)$, $e_2 = (2, -4, -1)$ and $e_3 = (1, -5, 7)$.

Set v as a linear combination of the $e_i$ using the unknowns x, y and z: $v = xe_1 + ye_2 + ze_3$.
\[
\begin{aligned}
(2, -5, 3) &= x(1, -3, 2) + y(2, -4, -1) + z(1, -5, 7) \\
&= (x + 2y + z,\ -3x - 4y - 5z,\ 2x - y + 7z)
\end{aligned}
\]
Form the equivalent system of equations and reduce to echelon form:
\[
\begin{aligned} x + 2y + z &= 2 \\ -3x - 4y - 5z &= -5 \\ 2x - y + 7z &= 3 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y + z &= 2 \\ 2y - 2z &= 1 \\ -5y + 5z &= -1 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y + z &= 2 \\ 2y - 2z &= 1 \\ 0 &= 3 \end{aligned}
\]
The system is inconsistent and so has no solution. Accordingly, v cannot be written as a linear combination of the vectors $e_1$, $e_2$ and $e_3$.
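The procedure used in Problems 4.17 and 4.18 (set up one equation per component, reduce, and read off either the coefficients or an inconsistency) can be sketched in code. The function name `solve_combination` is hypothetical; the sketch uses exact rational arithmetic and, if the solution is not unique, simply sets the free unknowns to 0.

```python
from fractions import Fraction

def solve_combination(vectors, target):
    """Return coefficients c with sum(c[k] * vectors[k]) == target,
    or None if the corresponding linear system is inconsistent."""
    n, m = len(target), len(vectors)
    # Augmented matrix: one row per component, one column per unknown.
    rows = [[Fraction(vectors[k][i]) for k in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    pivots = []
    r = 0
    for c in range(m):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        # Clear column c in every other row.
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # A row reading 0 = (nonzero) means the system has no solution.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    coeffs = [Fraction(0)] * m
    for i, c in enumerate(pivots):
        coeffs[c] = rows[i][-1]
    return coeffs

# Problem 4.17: a consistent system
c417 = solve_combination([(1, 1, 1), (1, 2, 3), (2, -1, 1)], (1, -2, 5))
# Problem 4.18: an inconsistent system
c418 = solve_combination([(1, -3, 2), (2, -4, -1), (1, -5, 7)], (2, -5, 3))
```

For Problem 4.17 the sketch returns the coefficients found above, and for Problem 4.18 it returns `None`, reflecting the row 0 = 3 in the echelon form.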


4.19. For which value of k will the vector $u = (1, -2, k)$ in $R^3$ be a linear combination of the vectors $v = (3, 0, -2)$ and $w = (2, -1, -5)$?

Set $u = xv + yw$:
\[
(1, -2, k) = x(3, 0, -2) + y(2, -1, -5) = (3x + 2y,\ -y,\ -2x - 5y)
\]
Form the equivalent system of equations:
\[
3x + 2y = 1, \qquad -y = -2, \qquad -2x - 5y = k
\]
By the first two equations, $x = -1$, $y = 2$. Substitute into the last equation to obtain $k = -8$.




4.20. Write the polynomial $v = t^2 + 4t - 3$ over R as a linear combination of the polynomials $e_1 = t^2 - 2t + 5$, $e_2 = 2t^2 - 3t$ and $e_3 = t + 3$.

Set v as a linear combination of the $e_i$ using the unknowns x, y and z: $v = xe_1 + ye_2 + ze_3$.
\[
\begin{aligned}
t^2 + 4t - 3 &= x(t^2 - 2t + 5) + y(2t^2 - 3t) + z(t + 3) \\
&= xt^2 - 2xt + 5x + 2yt^2 - 3yt + zt + 3z \\
&= (x + 2y)t^2 + (-2x - 3y + z)t + (5x + 3z)
\end{aligned}
\]
Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form:
\[
\begin{aligned} x + 2y &= 1 \\ -2x - 3y + z &= 4 \\ 5x + 3z &= -3 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y &= 1 \\ y + z &= 6 \\ -10y + 3z &= -8 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y &= 1 \\ y + z &= 6 \\ 13z &= 52 \end{aligned}
\]
Note that the system is consistent and so has a solution. Solve for the unknowns to obtain $x = -3$, $y = 2$, $z = 4$. Thus $v = -3e_1 + 2e_2 + 4e_3$.
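4.21. Write the matrix $E = \begin{pmatrix} 3 & 1 \\ 1 & -1 \end{pmatrix}$ as a linear combination of the matrices $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}$ and $C = \begin{pmatrix} 0 & 2 \\ 0 & -1 \end{pmatrix}$.

Set E as a linear combination of A, B, C using the unknowns x, y, z: $E = xA + yB + zC$.
\[
\begin{aligned}
\begin{pmatrix} 3 & 1 \\ 1 & -1 \end{pmatrix}
&= x\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} + y\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} + z\begin{pmatrix} 0 & 2 \\ 0 & -1 \end{pmatrix} \\
&= \begin{pmatrix} x & x \\ x & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ y & y \end{pmatrix} + \begin{pmatrix} 0 & 2z \\ 0 & -z \end{pmatrix}
= \begin{pmatrix} x & x + 2z \\ x + y & y - z \end{pmatrix}
\end{aligned}
\]
Form the equivalent system of equations by setting corresponding entries equal to each other:
\[
x = 3, \qquad x + y = 1, \qquad x + 2z = 1, \qquad y - z = -1
\]
Substitute $x = 3$ in the second and third equations to obtain $y = -2$ and $z = -1$. Since these values also satisfy the last equation, they form a solution of the system. Hence $E = 3A - 2B - C$.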


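4.22. Suppose u is a linear combination of the vectors $v_1, \ldots, v_m$ and suppose each $v_i$ is a linear combination of the vectors $w_1, \ldots, w_n$:
\[
u = a_1v_1 + a_2v_2 + \cdots + a_mv_m \quad\text{and}\quad v_i = b_{i1}w_1 + b_{i2}w_2 + \cdots + b_{in}w_n
\]
Show that u is also a linear combination of the $w_j$. Thus if $S \subset L(T)$, then $L(S) \subset L(T)$.
\[
\begin{aligned}
u &= a_1v_1 + a_2v_2 + \cdots + a_mv_m \\
&= a_1(b_{11}w_1 + \cdots + b_{1n}w_n) + a_2(b_{21}w_1 + \cdots + b_{2n}w_n) + \cdots + a_m(b_{m1}w_1 + \cdots + b_{mn}w_n) \\
&= (a_1b_{11} + a_2b_{21} + \cdots + a_mb_{m1})w_1 + \cdots + (a_1b_{1n} + a_2b_{2n} + \cdots + a_mb_{mn})w_n
\end{aligned}
\]
or simply
\[
u = \sum_{i=1}^{m} a_iv_i = \sum_{i=1}^{m} a_i \Bigl( \sum_{j=1}^{n} b_{ij}w_j \Bigr) = \sum_{j=1}^{n} \Bigl( \sum_{i=1}^{m} a_ib_{ij} \Bigr) w_j
\]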


LINEAR SPANS, GENERATORS 
4.23. Show that the vectors $u = (1, 2, 3)$, $v = (0, 1, 2)$ and $w = (0, 0, 1)$ generate $R^3$.

We need to show that an arbitrary vector $(a, b, c) \in R^3$ is a linear combination of u, v and w. Set $(a, b, c) = xu + yv + zw$:
\[
(a, b, c) = x(1, 2, 3) + y(0, 1, 2) + z(0, 0, 1) = (x,\ 2x + y,\ 3x + 2y + z)
\]
Then form the system of equations
\[
\begin{aligned} x &= a \\ 2x + y &= b \\ 3x + 2y + z &= c \end{aligned}
\quad\text{or}\quad
\begin{aligned} z + 2y + 3x &= c \\ y + 2x &= b \\ x &= a \end{aligned}
\]
The above system is in echelon form and is consistent; in fact $x = a$, $y = b - 2a$, $z = c - 2b + a$ is a solution. Thus u, v and w generate $R^3$.


4.24. Find conditions on a, b and c so that $(a, b, c) \in R^3$ belongs to the space generated by $u = (2, 1, 0)$, $v = (1, -1, 2)$ and $w = (0, 3, -4)$.

Set $(a, b, c)$ as a linear combination of u, v and w using unknowns x, y and z: $(a, b, c) = xu + yv + zw$.
\[
(a, b, c) = x(2, 1, 0) + y(1, -1, 2) + z(0, 3, -4) = (2x + y,\ x - y + 3z,\ 2y - 4z)
\]
Form the equivalent system of linear equations and reduce it to echelon form:
\[
\begin{aligned} 2x + y &= a \\ x - y + 3z &= b \\ 2y - 4z &= c \end{aligned}
\quad\text{or}\quad
\begin{aligned} 2x + y &= a \\ 3y - 6z &= a - 2b \\ 2y - 4z &= c \end{aligned}
\quad\text{or}\quad
\begin{aligned} 2x + y &= a \\ 3y - 6z &= a - 2b \\ 0 &= 2a - 4b - 3c \end{aligned}
\]
The vector $(a, b, c)$ belongs to the space generated by u, v and w if and only if the above system is consistent, and it is consistent if and only if $2a - 4b - 3c = 0$. Note, in particular, that u, v and w do not generate the whole space $R^3$.
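As a sanity check on the condition found in Problem 4.24, one can evaluate $2a - 4b - 3c$ on a grid of linear combinations of u, v and w. This is only a numerical spot check under the stated vectors, not a proof; the helper names `condition` and `combo` are hypothetical.

```python
u, v, w = (2, 1, 0), (1, -1, 2), (0, 3, -4)

def condition(a, b, c):
    """Left side of the consistency condition 2a - 4b - 3c = 0."""
    return 2 * a - 4 * b - 3 * c

def combo(x, y, z):
    """The linear combination x*u + y*v + z*w as a tuple."""
    return tuple(x * ui + y * vi + z * wi for ui, vi, wi in zip(u, v, w))

# Every combination on a small integer grid satisfies the condition.
values = [condition(*combo(x, y, z))
          for x in range(-2, 3) for y in range(-2, 3) for z in range(-2, 3)]

# The vector (1, 0, 0) violates the condition, so it lies outside the span.
outside = condition(1, 0, 0)
```

All entries of `values` are zero, while `outside` is not, which agrees with the observation that u, v and w do not generate all of the space.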


4.25. Show that the xy plane $W = \{(a, b, 0)\}$ in $R^3$ is generated by u and v where: (i) $u = (1, 2, 0)$ and $v = (0, 1, 0)$; (ii) $u = (2, -1, 0)$ and $v = (1, 3, 0)$.

In each case show that an arbitrary vector $(a, b, 0) \in W$ is a linear combination of u and v.

(i) Set $(a, b, 0) = xu + yv$:
\[
(a, b, 0) = x(1, 2, 0) + y(0, 1, 0) = (x,\ 2x + y,\ 0)
\]
Then form the system of equations
\[
\begin{aligned} x &= a \\ 2x + y &= b \\ 0 &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} y + 2x &= b \\ x &= a \end{aligned}
\]
The system is consistent; in fact $x = a$, $y = b - 2a$ is a solution. Hence u and v generate W.

(ii) Set $(a, b, 0) = xu + yv$:
\[
(a, b, 0) = x(2, -1, 0) + y(1, 3, 0) = (2x + y,\ -x + 3y,\ 0)
\]
Form the following system and reduce it to echelon form:
\[
\begin{aligned} 2x + y &= a \\ -x + 3y &= b \\ 0 &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} 2x + y &= a \\ 7y &= a + 2b \end{aligned}
\]
The system is consistent and so has a solution. Hence W is generated by u and v. (Observe that we do not need to solve for x and y; it is only necessary to know that a solution exists.)


4.26. Show that the vector space V of polynomials over any field K cannot be generated by a finite number of vectors.

Any finite set S of polynomials contains one of maximum degree, say m. Then the linear span $L(S)$ of S cannot contain polynomials of degree greater than m. Accordingly, $V \ne L(S)$, for any finite set S.




4.27. Prove Theorem 4.5: Let S be a nonempty subset of V. Then $L(S)$, the set of all linear combinations of vectors in S, is a subspace of V containing S. Furthermore, if W is any other subspace of V containing S, then $L(S) \subset W$.

If $v \in S$, then $1v = v \in L(S)$; hence S is a subset of $L(S)$. Also, $L(S)$ is nonempty since S is nonempty. Now suppose $v, w \in L(S)$; say,
\[
v = a_1v_1 + \cdots + a_mv_m \quad\text{and}\quad w = b_1w_1 + \cdots + b_nw_n
\]
where $v_i, w_j \in S$ and $a_i, b_j$ are scalars. Then
\[
v + w = a_1v_1 + \cdots + a_mv_m + b_1w_1 + \cdots + b_nw_n
\]
and, for any scalar k,
\[
kv = k(a_1v_1 + \cdots + a_mv_m) = ka_1v_1 + \cdots + ka_mv_m
\]
belong to $L(S)$ since each is a linear combination of vectors in S. Accordingly, $L(S)$ is a subspace of V.

Now suppose W is a subspace of V containing S and suppose $v_1, \ldots, v_m \in S \subset W$. Then all multiples $a_1v_1, \ldots, a_mv_m \in W$, where $a_i \in K$, and hence the sum $a_1v_1 + \cdots + a_mv_m \in W$. That is, W contains all linear combinations of elements of S. Consequently, $L(S) \subset W$ as claimed.


ROW SPACE OF A MATRIX 
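4.28. Determine whether the following matrices have the same row space:
\[
A = \begin{pmatrix} 1 & 1 & 5 \\ 2 & 3 & 13 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & -1 & -2 \\ 3 & -2 & -3 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & -1 & -1 \\ 4 & -3 & -1 \\ 3 & -1 & 3 \end{pmatrix}
\]
Row reduce each matrix to row canonical form:
\[
A = \begin{pmatrix} 1 & 1 & 5 \\ 2 & 3 & 13 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 1 & 5 \\ 0 & 1 & 3 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \end{pmatrix}
\]
\[
B = \begin{pmatrix} 1 & -1 & -2 \\ 3 & -2 & -3 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & -1 & -2 \\ 0 & 1 & 3 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 3 \end{pmatrix}
\]
\[
C = \begin{pmatrix} 1 & -1 & -1 \\ 4 & -3 & -1 \\ 3 & -1 & 3 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & 3 \\ 0 & 2 & 6 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix}
\]
Since the nonzero rows of the reduced form of A and of the reduced form of C are the same, A and C have the same row space. On the other hand, the nonzero rows of the reduced form of B are not the same as the others, and so B has a different row space.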
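The test used here (reduce each matrix to row canonical form and compare the nonzero rows) lends itself to a short computation. The sketch below uses exact rational arithmetic; the names `row_canonical` and `same_row_space` are hypothetical, and the three matrices are sample inputs of the shapes discussed in Problem 4.28.

```python
from fractions import Fraction

def row_canonical(matrix):
    """Return the nonzero rows of the row reduced echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        # Look for a pivot in column c at or below row r.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        # Make the pivot the only nonzero entry in its column.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return [row for row in rows if any(x != 0 for x in row)]

def same_row_space(m1, m2):
    """Two matrices with the same number of columns have the same row
    space exactly when their row canonical forms agree."""
    return row_canonical(m1) == row_canonical(m2)

A = [[1, 1, 5], [2, 3, 13]]
B = [[1, -1, -2], [3, -2, -3]]
C = [[1, -1, -1], [4, -3, -1], [3, -1, 3]]
```

With these inputs `same_row_space(A, C)` holds while `same_row_space(A, B)` does not. Applying the same function to transposes gives a test for equal column spaces.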


4.29. Consider an arbitrary matrix $A = (a_{ij})$. Suppose $u = (b_1, \ldots, b_n)$ is a linear combination of the rows $R_1, \ldots, R_m$ of A; say $u = k_1R_1 + \cdots + k_mR_m$. Show that, for each i, $b_i = k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi}$ where $a_{1i}, \ldots, a_{mi}$ are the entries of the ith column of A.

We are given $u = k_1R_1 + \cdots + k_mR_m$; hence
\[
\begin{aligned}
(b_1, \ldots, b_n) &= k_1(a_{11}, \ldots, a_{1n}) + \cdots + k_m(a_{m1}, \ldots, a_{mn}) \\
&= (k_1a_{11} + \cdots + k_ma_{m1},\ \ldots,\ k_1a_{1n} + \cdots + k_ma_{mn})
\end{aligned}
\]
Setting corresponding components equal to each other, we obtain the desired result.



4.30. Prove: Let $A = (a_{ij})$ be an echelon matrix with distinguished entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$, and let $B = (b_{ij})$ be an echelon matrix with distinguished entries $b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s}$:
\[
A = \begin{pmatrix}
a_{1j_1} & * & * & * & * & * \\
 & a_{2j_2} & * & * & * & * \\
 & & \ddots & & & \\
 & & & a_{rj_r} & * & *
\end{pmatrix}, \qquad
B = \begin{pmatrix}
b_{1k_1} & * & * & * & * & * \\
 & b_{2k_2} & * & * & * & * \\
 & & \ddots & & & \\
 & & & b_{sk_s} & * & *
\end{pmatrix}
\]
Suppose A and B have the same row space. Then the distinguished entries of A and of B are in the same position: $j_1 = k_1$, $j_2 = k_2$, ..., $j_r = k_r$, and $r = s$.

Clearly $A = 0$ if and only if $B = 0$, and so we need only prove the theorem when $r \ge 1$ and $s \ge 1$. We first show that $j_1 = k_1$. Suppose $j_1 < k_1$. Then the $j_1$th column of B is zero. Since the first row of A is in the row space of B, we have by the preceding problem, $a_{1j_1} = c_1 0 + c_2 0 + \cdots + c_m 0 = 0$ for scalars $c_i$. But this contradicts the fact that the distinguished element $a_{1j_1} \ne 0$. Hence $j_1 \ge k_1$, and similarly $k_1 \ge j_1$. Thus $j_1 = k_1$.

Now let A' be the submatrix of A obtained by deleting the first row of A, and let B' be the submatrix of B obtained by deleting the first row of B. We prove that A' and B' have the same row space. The theorem will then follow by induction since A' and B' are also echelon matrices.

Let $R = (a_1, a_2, \ldots, a_n)$ be any row of A' and let $R_1, \ldots, R_m$ be the rows of B. Since R is in the row space of B, there exist scalars $d_1, \ldots, d_m$ such that $R = d_1R_1 + d_2R_2 + \cdots + d_mR_m$. Since A is in echelon form and R is not the first row of A, the $j_1$th entry of R is zero: $a_i = 0$ for $i = j_1 = k_1$. Furthermore, since B is in echelon form, all the entries in the $k_1$th column of B are 0 except the first: $b_{1k_1} \ne 0$, but $b_{2k_1} = 0, \ldots, b_{mk_1} = 0$. Thus
\[
0 = a_{k_1} = d_1b_{1k_1} + d_2 0 + \cdots + d_m 0 = d_1b_{1k_1}
\]
Now $b_{1k_1} \ne 0$ and so $d_1 = 0$. Thus R is a linear combination of $R_2, \ldots, R_m$ and so is in the row space of B'. Since R was any row of A', the row space of A' is contained in the row space of B'. Similarly, the row space of B' is contained in the row space of A'. Thus A' and B' have the same row space, and so the theorem is proved.


4.31. Prove Theorem 4.7: Let $A = (a_{ij})$ and $B = (b_{ij})$ be row reduced echelon matrices. Then A and B have the same row space if and only if they have the same nonzero rows.

Obviously, if A and B have the same nonzero rows then they have the same row space. Thus we only have to prove the converse.

Suppose A and B have the same row space, and suppose $R \ne 0$ is the ith row of A. Then there exist scalars $c_1, \ldots, c_s$ such that
\[
R = c_1R_1 + c_2R_2 + \cdots + c_sR_s \tag{1}
\]
where the $R_i$ are the nonzero rows of B. The theorem is proved if we show that $R = R_i$, or $c_i = 1$ but $c_k = 0$ for $k \ne i$.

Let $a_{ij_i}$ be the distinguished entry in R, i.e. the first nonzero entry of R. By (1) and Problem 4.29,
\[
a_{ij_i} = c_1b_{1j_i} + c_2b_{2j_i} + \cdots + c_sb_{sj_i} \tag{2}
\]
But by the preceding problem $b_{ij_i}$ is a distinguished entry of B and, since B is row reduced, it is the only nonzero entry in the $j_i$th column of B. Thus from (2) we obtain $a_{ij_i} = c_ib_{ij_i}$. However, $a_{ij_i} = 1$ and $b_{ij_i} = 1$ since A and B are row reduced; hence $c_i = 1$.

Now suppose $k \ne i$, and $b_{kj_k}$ is the distinguished entry in $R_k$. By (1) and Problem 4.29,
\[
a_{ij_k} = c_1b_{1j_k} + c_2b_{2j_k} + \cdots + c_sb_{sj_k} \tag{3}
\]
Since B is row reduced, $b_{kj_k}$ is the only nonzero entry in the $j_k$th column of B; hence by (3), $a_{ij_k} = c_kb_{kj_k}$. Furthermore, by the preceding problem $a_{kj_k}$ is a distinguished entry of A and, since A is row reduced, $a_{ij_k} = 0$. Thus $c_kb_{kj_k} = 0$ and, since $b_{kj_k} = 1$, $c_k = 0$. Accordingly $R = R_i$ and the theorem is proved.


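4.32. Determine whether the following matrices have the same column space:
\[
A = \begin{pmatrix} 1 & 3 & 5 \\ 1 & 4 & 3 \\ 1 & 1 & 9 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 2 & 3 \\ -2 & -3 & -4 \\ 7 & 12 & 17 \end{pmatrix}
\]
Observe that A and B have the same column space if and only if the transposes $A^t$ and $B^t$ have the same row space. Thus reduce $A^t$ and $B^t$ to row reduced echelon form:
\[
A^t = \begin{pmatrix} 1 & 1 & 1 \\ 3 & 4 & 1 \\ 5 & 3 & 9 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & -2 \\ 0 & -2 & 4 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}
\]
\[
B^t = \begin{pmatrix} 1 & -2 & 7 \\ 2 & -3 & 12 \\ 3 & -4 & 17 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 2 & -4 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}
\text{ to } \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}
\]
Since $A^t$ and $B^t$ have the same row space, A and B have the same column space.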


4.33. Let R be a row vector and B a matrix for which RB is defined. Show that RB is a linear combination of the rows of B. Furthermore, if A is a matrix for which AB is defined, show that the row space of AB is contained in the row space of B.

Suppose $R = (a_1, a_2, \ldots, a_m)$ and $B = (b_{ij})$. Let $B_1, \ldots, B_m$ denote the rows of B and $B^1, \ldots, B^n$ its columns. Then
\[
\begin{aligned}
RB &= (R \cdot B^1,\ R \cdot B^2,\ \ldots,\ R \cdot B^n) \\
&= (a_1b_{11} + a_2b_{21} + \cdots + a_mb_{m1},\ a_1b_{12} + a_2b_{22} + \cdots + a_mb_{m2},\ \ldots,\ a_1b_{1n} + a_2b_{2n} + \cdots + a_mb_{mn}) \\
&= a_1(b_{11}, b_{12}, \ldots, b_{1n}) + a_2(b_{21}, b_{22}, \ldots, b_{2n}) + \cdots + a_m(b_{m1}, b_{m2}, \ldots, b_{mn}) \\
&= a_1B_1 + a_2B_2 + \cdots + a_mB_m
\end{aligned}
\]
Thus RB is a linear combination of the rows of B, as claimed.

By Problem 3.27, the rows of AB are $R_iB$ where $R_i$ is the ith row of A. Hence by the above result each row of AB is in the row space of B. Thus the row space of AB is contained in the row space of B.
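The identity $RB = a_1B_1 + \cdots + a_mB_m$ can be illustrated by computing both sides independently. In the sketch below the row vector and matrix are arbitrary sample data, and the function names are hypothetical.

```python
def row_times_matrix(r, b):
    """Compute RB entry by entry: the jth entry is R . (jth column of B)."""
    return [sum(r[i] * b[i][j] for i in range(len(r)))
            for j in range(len(b[0]))]

def combination_of_rows(r, b):
    """Compute a_1*B_1 + ... + a_m*B_m, where B_i are the rows of B."""
    out = [0] * len(b[0])
    for a_i, row in zip(r, b):
        out = [o + a_i * x for o, x in zip(out, row)]
    return out

R = [2, -1, 3]              # sample row vector (a_1, a_2, a_3)
B = [[1, 0, 2],
     [0, 1, 1],
     [4, -1, 0]]            # sample 3 x 3 matrix

product = row_times_matrix(R, B)
combination = combination_of_rows(R, B)
```

The two results agree, which is the content of the first part of Problem 4.33; applying this to each row of A gives the statement about the row space of AB.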


SUMS AND DIRECT SUMS 
4.34. Let U and W be subspaces of a vector space V. Show that:

(i) U and W are contained in $U + W$;

(ii) $U + W$ is the smallest subspace of V containing U and W, that is, $U + W$ is the linear span of U and W: $U + W = L(U, W)$.

(i) Let $u \in U$. By hypothesis W is a subspace of V and so $0 \in W$. Hence $u = u + 0 \in U + W$. Accordingly, U is contained in $U + W$. Similarly, W is contained in $U + W$.

(ii) Since $U + W$ is a subspace of V (Theorem 4.8) containing both U and W, it must also contain the linear span of U and W: $L(U, W) \subset U + W$.

On the other hand, if $v \in U + W$ then $v = u + w = 1u + 1w$ where $u \in U$ and $w \in W$; hence v is a linear combination of elements in $U \cup W$ and so belongs to $L(U, W)$. Thus $U + W \subset L(U, W)$.

The two inclusion relations give us the required result.




4.35. Suppose U and W are subspaces of a vector space V, and that $\{u_i\}$ generates U and $\{w_j\}$ generates W. Show that $\{u_i, w_j\}$, i.e. $\{u_i\} \cup \{w_j\}$, generates $U + W$.

Let $v \in U + W$. Then $v = u + w$ where $u \in U$ and $w \in W$. Since $\{u_i\}$ generates U, u is a linear combination of $u_i$'s; and since $\{w_j\}$ generates W, w is a linear combination of $w_j$'s:
\[
\begin{aligned}
u &= a_1u_{i_1} + a_2u_{i_2} + \cdots + a_nu_{i_n}, \quad a_j \in K \\
w &= b_1w_{j_1} + b_2w_{j_2} + \cdots + b_mw_{j_m}, \quad b_j \in K
\end{aligned}
\]
Thus
\[
v = u + w = a_1u_{i_1} + a_2u_{i_2} + \cdots + a_nu_{i_n} + b_1w_{j_1} + b_2w_{j_2} + \cdots + b_mw_{j_m}
\]
and so $\{u_i, w_j\}$ generates $U + W$.


4.36. Prove Theorem 4.9: The vector space V is the direct sum of its subspaces U and W if and only if (i) $V = U + W$ and (ii) $U \cap W = \{0\}$.

Suppose $V = U \oplus W$. Then any $v \in V$ can be uniquely written in the form $v = u + w$ where $u \in U$ and $w \in W$. Thus, in particular, $V = U + W$. Now suppose $v \in U \cap W$. Then:

(1) $v = v + 0$ where $v \in U$, $0 \in W$; and (2) $v = 0 + v$ where $0 \in U$, $v \in W$

Since such a sum for v must be unique, $v = 0$. Accordingly, $U \cap W = \{0\}$.

On the other hand, suppose $V = U + W$ and $U \cap W = \{0\}$. Let $v \in V$. Since $V = U + W$, there exist $u \in U$ and $w \in W$ such that $v = u + w$. We need to show that such a sum is unique. Suppose also that $v = u' + w'$ where $u' \in U$ and $w' \in W$. Then
\[
u + w = u' + w' \quad\text{and so}\quad u - u' = w' - w
\]
But $u - u' \in U$ and $w' - w \in W$; hence by $U \cap W = \{0\}$,
\[
u - u' = 0,\ w' - w = 0 \quad\text{and so}\quad u = u',\ w = w'
\]
Thus such a sum for $v \in V$ is unique and $V = U \oplus W$.


4.37. Let U and W be the subspaces of $R^3$ defined by
\[
U = \{(a, b, c) : a = b = c\} \quad\text{and}\quad W = \{(0, b, c)\}
\]
(Note that W is the yz plane.) Show that $R^3 = U \oplus W$.

Note first that $U \cap W = \{0\}$, for $v = (a, b, c) \in U \cap W$ implies that
\[
a = b = c \text{ and } a = 0 \quad\text{which implies}\quad a = 0,\ b = 0,\ c = 0
\]
i.e. $v = (0, 0, 0)$.

We also claim that $R^3 = U + W$. For if $v = (a, b, c) \in R^3$, then $v = (a, a, a) + (0, b - a, c - a)$ where $(a, a, a) \in U$ and $(0, b - a, c - a) \in W$. Both conditions, $U \cap W = \{0\}$ and $R^3 = U + W$, imply $R^3 = U \oplus W$.


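4.38. Let V be the vector space of n-square matrices over the real field R. Let U and W be the subspaces of symmetric and antisymmetric matrices, respectively. Show that $V = U \oplus W$. (The matrix M is symmetric iff $M = M^t$, and antisymmetric iff $M^t = -M$.)

We first show that $V = U + W$. Let A be any arbitrary n-square matrix. Note that
\[
A = \tfrac{1}{2}(A + A^t) + \tfrac{1}{2}(A - A^t)
\]
We claim that $\tfrac{1}{2}(A + A^t) \in U$ and that $\tfrac{1}{2}(A - A^t) \in W$. For
\[
\bigl(\tfrac{1}{2}(A + A^t)\bigr)^t = \tfrac{1}{2}(A + A^t)^t = \tfrac{1}{2}(A^t + A^{tt}) = \tfrac{1}{2}(A + A^t)
\]
that is, $\tfrac{1}{2}(A + A^t)$ is symmetric. Furthermore,
\[
\bigl(\tfrac{1}{2}(A - A^t)\bigr)^t = \tfrac{1}{2}(A - A^t)^t = \tfrac{1}{2}(A^t - A) = -\tfrac{1}{2}(A - A^t)
\]
that is, $\tfrac{1}{2}(A - A^t)$ is antisymmetric.

We next show that $U \cap W = \{0\}$. Suppose $M \in U \cap W$. Then $M = M^t$ and $M^t = -M$ which implies $M = -M$ or $M = 0$. Hence $U \cap W = \{0\}$. Accordingly, $V = U \oplus W$.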
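The decomposition used in this proof can be carried out explicitly on a sample matrix. The sketch below assumes matrices stored as nested lists of integers or rationals; `transpose` and `decompose` are hypothetical helper names, and the 2 x 2 matrix M is arbitrary sample data.

```python
from fractions import Fraction

def transpose(m):
    """Transpose of a square matrix given as nested lists."""
    return [list(col) for col in zip(*m)]

def decompose(m):
    """Split m into (1/2)(m + m^t), which is symmetric, and
    (1/2)(m - m^t), which is antisymmetric."""
    t = transpose(m)
    n = len(m)
    half = Fraction(1, 2)
    sym = [[half * (m[i][j] + t[i][j]) for j in range(n)] for i in range(n)]
    anti = [[half * (m[i][j] - t[i][j]) for j in range(n)] for i in range(n)]
    return sym, anti

M = [[1, 2],
     [4, 3]]
sym, anti = decompose(M)
# Recombine the two parts; the result should reproduce M.
recombined = [[sym[i][j] + anti[i][j] for j in range(2)] for i in range(2)]
```

Here `sym` equals its own transpose, `anti` equals the negative of its transpose, and their sum gives back `M`, in line with $V = U + W$.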




Supplementary Problems 


VECTOR SPACES 


4.39. Let V be the set of infinite sequences $(a_1, a_2, \ldots)$ in a field K with addition in V and scalar multiplication on V defined by
\[
\begin{aligned}
(a_1, a_2, \ldots) + (b_1, b_2, \ldots) &= (a_1 + b_1,\ a_2 + b_2,\ \ldots) \\
k(a_1, a_2, \ldots) &= (ka_1,\ ka_2,\ \ldots)
\end{aligned}
\]
where $a_i, b_i, k \in K$. Show that V is a vector space over K.

4.40. Let V be the set of ordered pairs (a, b) of real numbers with addition in V and scalar multiplication on V defined by
\[
(a, b) + (c, d) = (a + c,\ b + d) \quad\text{and}\quad k(a, b) = (ka, 0)
\]
Show that V satisfies all of the axioms of a vector space except $[M_4]$: $1u = u$. Hence $[M_4]$ is not a consequence of the other axioms.

4.41. Let V be the set of ordered pairs (a, b) of real numbers. Show that V is not a vector space over R with addition in V and scalar multiplication on V defined by:

(i) $(a, b) + (c, d) = (a + d,\ b + c)$ and $k(a, b) = (ka, kb)$;
(ii) $(a, b) + (c, d) = (a + c,\ b + d)$ and $k(a, b) = (a, b)$;
(iii) $(a, b) + (c, d) = (0, 0)$ and $k(a, b) = (ka, kb)$;
(iv) $(a, b) + (c, d) = (ac, bd)$ and $k(a, b) = (ka, kb)$.

4.42. Let V be the set of ordered pairs $(z_1, z_2)$ of complex numbers. Show that V is a vector space over the real field R with addition in V and scalar multiplication on V defined by
\[
(z_1, z_2) + (w_1, w_2) = (z_1 + w_1,\ z_2 + w_2) \quad\text{and}\quad k(z_1, z_2) = (kz_1, kz_2)
\]
where $z_1, z_2, w_1, w_2 \in C$ and $k \in R$.

4.43. Let V be a vector space over K, and let F' be a subfield of K. Show that V is also a vector space 
over F' where vector addition with respect to F' is the same as that with respect to K, and where 
scalar multiplication by an element k € F' is the same as multiplication by k as an element of K. 

4.44. Show that $[A_4]$, page 63, can be derived from the other axioms of a vector space.

4.45. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u, w) where u belongs to U and w to W: $V = \{(u, w) : u \in U,\ w \in W\}$. Show that V is a vector space over K with addition in V and scalar multiplication on V defined by
\[
(u, w) + (u', w') = (u + u',\ w + w') \quad\text{and}\quad k(u, w) = (ku, kw)
\]
where $u, u' \in U$, $w, w' \in W$ and $k \in K$. (This space V is called the external direct sum of U and W.)

SUBSPACES 

4.46. Consider the vector space V in Problem 4.39, of infinite sequences $(a_1, a_2, \ldots)$ in a field K. Show that W is a subspace of V if:

(i) W consists of all sequences with 0 as the first component;
(ii) W consists of all sequences with only a finite number of nonzero components.

4.47. Determine whether or not W is a subspace of $R^3$ if W consists of those vectors $(a, b, c) \in R^3$ for which: (i) $a = 2b$; (ii) $a \le b \le c$; (iii) $ab = 0$; (iv) $a = b = c$; (v) $a = b^2$; (vi) $k_1a + k_2b + k_3c = 0$, where $k_i \in R$.

4.48. Let V be the vector space of n-square matrices over a field K. Show that W is a subspace of V if W consists of all matrices which are (i) antisymmetric ($A^t = -A$), (ii) (upper) triangular, (iii) diagonal, (iv) scalar.


4.49. Let $AX = B$ be a nonhomogeneous system of linear equations in n unknowns over a field K. Show that the solution set of the system is not a subspace of $K^n$.

4.50. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V in each of the following cases.

(i) W consists of all bounded functions. (Here $f : R \to R$ is bounded if there exists $M \in R$ such that $|f(x)| \le M$, $\forall x \in R$.)

(ii) W consists of all even functions. (Here $f : R \to R$ is even if $f(-x) = f(x)$, $\forall x \in R$.)

(iii) W consists of all continuous functions.

(iv) W consists of all differentiable functions.

(v) W consists of all integrable functions in, say, the interval $0 \le x \le 1$.

(The last three cases require some knowledge of analysis.)

4.51. Discuss whether or not $R^2$ is a subspace of $R^3$.

4.52. Prove Theorem 4.4: The intersection of any number of subspaces of a vector space V is a subspace of V.

4.53. Suppose U and W are subspaces of V for which $U \cup W$ is also a subspace. Show that either $U \subset W$ or $W \subset U$.


LINEAR COMBINATIONS 


4.54. Consider the vectors $u = (1, -3, 2)$ and $v = (2, -1, 1)$ in $R^3$.

(i) Write $(1, 7, -4)$ as a linear combination of u and v.

(ii) Write $(2, -5, 4)$ as a linear combination of u and v.

(iii) For which value of k is $(1, k, 5)$ a linear combination of u and v?

(iv) Find a condition on a, b and c so that $(a, b, c)$ is a linear combination of u and v.

4.55. Write u as a linear combination of the polynomials $v = 2t^2 + 3t - 4$ and $w = t^2 - 2t - 3$ where (i) $u = 3t^2 + 8t - 5$, (ii) $u = 4t^2 - 6t - 1$.

4.56. Write E as a linear combination of $A = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$ and $C = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$ where: (i) $E = \begin{pmatrix} 3 & -1 \\ 1 & -2 \end{pmatrix}$; (ii) $E = \begin{pmatrix} 2 & 1 \\ -1 & -2 \end{pmatrix}$.

LINEAR SPANS, GENERATORS 


4.57. Show that (1, 1, 1), (0, 1, 1) and (0, 1, -1) generate $R^3$, i.e. that any vector (a, b, c) is a linear combination of the given vectors.

4.58. Show that the yz plane $W = \{(0, b, c)\}$ in $R^3$ is generated by: (i) (0, 1, 1) and (0, 2, -1); (ii) (0, 1, 2), (0, 2, 3) and (0, 3, 1).

4.59. Show that the complex numbers $w = 2 + 3i$ and $z = 1 - 2i$ generate the complex field C as a vector space over the real field R.

4.60. Show that the polynomials $(1 - t)^3$, $(1 - t)^2$, $1 - t$ and 1 generate the space of polynomials of degree $\le 3$.

4.61. Find one vector in $R^3$ which generates the intersection of U and W where U is the xy plane: $U = \{(a, b, 0)\}$, and W is the space generated by the vectors (1, 2, 3) and (1, -1, 1).

4.62. Prove: $L(S)$ is the intersection of all the subspaces of V containing S.




4.63. Show that $L(S) = L(S \cup \{0\})$. That is, by joining or deleting the zero vector from a set, we do not change the space generated by the set.

4.64. Show that if $S \subset T$, then $L(S) \subset L(T)$.

4.65. Show that $L(L(S)) = L(S)$.


ROW SPACE OF A MATRIX 


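4.66. Determine which of the following matrices have the same row space:
\[
A = \begin{pmatrix} 1 & -2 & -1 \\ 3 & -4 & 5 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & -1 & 2 \\ 2 & 3 & -1 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & -1 & 3 \\ 2 & -1 & 10 \\ 3 & -5 & 1 \end{pmatrix}
\]

4.67. Let
\[
\begin{aligned}
u_1 &= (1, 1, -1), & u_2 &= (2, 3, -1), & u_3 &= (3, 1, -5) \\
v_1 &= (1, -1, -3), & v_2 &= (3, -2, -8), & v_3 &= (2, 1, -3)
\end{aligned}
\]
Show that the subspace of $R^3$ generated by the $u_i$ is the same as the subspace generated by the $v_i$.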


4.68. Show that if any row of an echelon (row reduced echelon) matrix is deleted, then the resulting 
matrix is still in echelon (row reduced echelon) form. 


4.69. Prove the converse of Theorem 4.6: Matrices with the same row space (and the same size) are 
row equivalent. 


4.70. Show that A and B have the same column space iff $A^t$ and $B^t$ have the same row space.


4.71. Let A and B be matrices for which AB is defined. Show that the column space of AB is contained 
in the column space of A. 


SUMS AND DIRECT SUMS 


4.72. We extend the notion of sum to arbitrary nonempty subsets (not necessarily subspaces) S and T of a vector space V by defining $S + T = \{s + t : s \in S,\ t \in T\}$. Show that this operation satisfies:

(i) commutative law: $S + T = T + S$;

(ii) associative law: $(S_1 + S_2) + S_3 = S_1 + (S_2 + S_3)$;

(iii) $S + \{0\} = \{0\} + S = S$;

(iv) $S + V = V + S = V$.

4.73. Show that for any subspace W of a vector space V, $W + W = W$.

4.74. Give an example of a subset S of a vector space V which is not a subspace of V but for which (i) $S + S = S$, (ii) $S + S \subset S$ (properly contained).

4.75. We extend the notion of sum of subspaces to more than two summands as follows. If $W_1, W_2, \ldots, W_n$ are subspaces of V, then
\[
W_1 + W_2 + \cdots + W_n = \{w_1 + w_2 + \cdots + w_n : w_i \in W_i\}
\]
Show that:

(i) $L(W_1, W_2, \ldots, W_n) = W_1 + W_2 + \cdots + W_n$;
(ii) if $S_i$ generates $W_i$, $i = 1, \ldots, n$, then $S_1 \cup S_2 \cup \cdots \cup S_n$ generates $W_1 + W_2 + \cdots + W_n$.

4.76. Suppose U, V and W are subspaces of a vector space. Prove that
\[
(U \cap V) + (U \cap W) \subset U \cap (V + W)
\]
Find subspaces of $R^2$ for which equality does not hold.




4.77. Let U, V and W be the following subspaces of $R^3$:
\[
U = \{(a, b, c) : a + b + c = 0\}, \quad V = \{(a, b, c) : a = c\}, \quad W = \{(0, 0, c) : c \in R\}
\]
Show that (i) $R^3 = U + V$, (ii) $R^3 = U + W$, (iii) $R^3 = V + W$. When is the sum direct?

4.78. Let V be the vector space of all functions from the real field R into R. Let U be the subspace of even functions and W the subspace of odd functions. Show that $V = U \oplus W$. (Recall that f is even iff $f(-x) = f(x)$, and f is odd iff $f(-x) = -f(x)$.)

4.79. Let $W_1, W_2, \ldots$ be subspaces of a vector space V for which $W_1 \subset W_2 \subset \cdots$. Let $W = W_1 \cup W_2 \cup \cdots$. Show that W is a subspace of V.

4.80. In the preceding problem, suppose $S_i$ generates $W_i$, $i = 1, 2, \ldots$. Show that $S = S_1 \cup S_2 \cup \cdots$ generates W.

4.81. Let V be the vector space of n-square matrices over a field K. Let U be the subspace of upper triangular matrices and W the subspace of lower triangular matrices. Find (i) $U + W$, (ii) $U \cap W$.

4.82. Let V be the external direct sum of the vector spaces U and W over a field K. (See Problem 4.45.) Let
\[
\hat{U} = \{(u, 0) : u \in U\}, \qquad \hat{W} = \{(0, w) : w \in W\}
\]
Show that (i) $\hat{U}$ and $\hat{W}$ are subspaces of V, (ii) $V = \hat{U} \oplus \hat{W}$.


Answers to Supplementary Problems

4.47. (i) Yes. (ii) No; e.g. $(1, 2, 3) \in W$ but $-2(1, 2, 3) \notin W$. (iii) No; e.g. $(1, 0, 0), (0, 1, 0) \in W$, but not their sum. (iv) Yes. (v) No; e.g. $(9, 3, 0) \in W$ but $2(9, 3, 0) \notin W$. (vi) Yes.

4.50. (i) Let $f, g \in W$ with $M_f$ and $M_g$ bounds for f and g respectively. Then for any scalars $a, b \in R$,
\[
|(af + bg)(x)| = |af(x) + bg(x)| \le |af(x)| + |bg(x)| = |a|\,|f(x)| + |b|\,|g(x)| \le |a|M_f + |b|M_g
\]
That is, $|a|M_f + |b|M_g$ is a bound for the function $af + bg$.

(ii) $(af + bg)(-x) = af(-x) + bg(-x) = af(x) + bg(x) = (af + bg)(x)$

4.51. No. Although one may "identify" the vector $(a, b) \in R^2$ with, say, $(a, b, 0)$ in the xy plane in $R^3$, they are distinct elements belonging to distinct, disjoint sets.

4.54. (i) $-3u + 2v$. (ii) Impossible. (iii) $k = -8$. (iv) $a - 3b - 5c = 0$.

4.55. (i) $u = 2v - w$. (ii) Impossible.

4.56. (i) $E = 2A - B + 2C$. (ii) Impossible.

4.61. $(2, -5, 0)$.

4.66. A and C.

4.67. Form the matrix A whose rows are the $u_i$ and the matrix B whose rows are the $v_i$, and then show that A and B have the same row canonical forms.

4.74. (i) In $R^2$, let $S = \{(0, 0), (0, 1), (0, 2), (0, 3), \ldots\}$.
(ii) In $R^2$, let $S = \{(0, 5), (0, 6), (0, 7), \ldots\}$.

4.77. The sum is direct in (ii) and (iii).

4.78. Hint. $f(x) = \tfrac{1}{2}(f(x) + f(-x)) + \tfrac{1}{2}(f(x) - f(-x))$, where $\tfrac{1}{2}(f(x) + f(-x))$ is even and $\tfrac{1}{2}(f(x) - f(-x))$ is odd.

4.81. (i) $V = U + W$. (ii) $U \cap W$ is the space of diagonal matrices.


Chapter 5 


Basis and Dimension 


INTRODUCTION 
Some of the fundamental results proven in this chapter are: 
(i) The “dimension” of a vector space is well defined (Theorem 5.3). 
(ii) If V has dimension n over K, then V is "isomorphic" to $K^n$ (Theorem 5.12).


(iii) A system of linear equations has a solution if and only if the coefficient and 
augmented matrices have the same “rank” (Theorem 5.10). 
These concepts and results are nontrivial and answer certain questions raised and investi- 
gated by mathematicians of yesterday. 


We will begin the chapter with the definition of linear dependence and independence. 
This concept plays an essential role in the theory of linear algebra and in mathematics in 
general. 


LINEAR DEPENDENCE 


Definition: Let V be a vector space over a field K. The vectors $v_1, \ldots, v_m \in V$ are said to be linearly dependent over K, or simply dependent, if there exist scalars $a_1, \ldots, a_m \in K$, not all of them 0, such that
\[
a_1v_1 + a_2v_2 + \cdots + a_mv_m = 0 \tag{*}
\]
Otherwise, the vectors are said to be linearly independent over K, or simply independent.

Observe that the relation (*) will always hold if the a's are all 0. If this relation holds only in this case, that is,
\[
a_1v_1 + a_2v_2 + \cdots + a_mv_m = 0 \quad\text{only if}\quad a_1 = 0, \ldots, a_m = 0
\]
then the vectors are linearly independent. On the other hand, if the relation (*) also holds when one of the a's is not 0, then the vectors are linearly dependent.

Observe that if 0 is one of the vectors $v_1, \ldots, v_m$, say $v_1 = 0$, then the vectors must be dependent; for
\[
1v_1 + 0v_2 + \cdots + 0v_m = 1 \cdot 0 + 0 + \cdots + 0 = 0
\]
and the coefficient of $v_1$ is not 0. On the other hand, any nonzero vector v is, by itself, independent; for
\[
kv = 0,\ v \ne 0 \quad\text{implies}\quad k = 0
\]
Other examples of dependent and independent vectors follow.

Example 5.1: The vectors $u = (1, -1, 0)$, $v = (1, 3, -1)$ and $w = (5, 3, -2)$ are dependent since, for $3u + 2v - w = 0$,
\[
3(1, -1, 0) + 2(1, 3, -1) - (5, 3, -2) = (0, 0, 0)
\]



Example 5.2: We show that the vectors u = (6, 2, 3, 4), v = (0, 5, -3, 1) and w = (0, 0, 7, -2)
are independent. For suppose xu + yv + zw = 0 where x, y and z are unknown
scalars. Then

$$\begin{aligned}
(0, 0, 0, 0) &= x(6, 2, 3, 4) + y(0, 5, -3, 1) + z(0, 0, 7, -2) \\
&= (6x,\ 2x + 5y,\ 3x - 3y + 7z,\ 4x + y - 2z)
\end{aligned}$$

and so, by the equality of the corresponding components,

$$\begin{aligned}
6x &= 0 \\
2x + 5y &= 0 \\
3x - 3y + 7z &= 0 \\
4x + y - 2z &= 0
\end{aligned}$$

The first equation yields x = 0; the second equation with x = 0 yields y = 0; and
the third equation with x = 0, y = 0 yields z = 0. Thus

$$xu + yv + zw = 0 \quad\text{implies}\quad x = 0,\ y = 0,\ z = 0$$

Accordingly u, v and w are independent.


Observe that the vectors in the preceding example form a matrix in echelon form: 


$$\begin{pmatrix} 6 & 2 & 3 & 4 \\ 0 & 5 & -3 & 1 \\ 0 & 0 & 7 & -2 \end{pmatrix}$$


Thus we have shown that the (nonzero) rows of the above echelon matrix are independent. 
This result holds true in general; we state it formally as a theorem since it will be frequently 
used. 


Theorem 5.1: The nonzero rows of a matrix in echelon form are linearly independent. 


For more than one vector, the concept of dependence can be defined equivalently as 
follows: 
The vectors $v_1, \ldots, v_m$ are linearly dependent if and only if one of them is a linear
combination of the others.
For suppose, say, $v_i$ is a linear combination of the others:

$$v_i = a_1 v_1 + \cdots + a_{i-1} v_{i-1} + a_{i+1} v_{i+1} + \cdots + a_m v_m$$

Then by adding $-v_i$ to both sides, we obtain

$$a_1 v_1 + \cdots + a_{i-1} v_{i-1} - v_i + a_{i+1} v_{i+1} + \cdots + a_m v_m = 0$$

where the coefficient of $v_i$ is not 0; hence the vectors are linearly dependent. Conversely,
suppose the vectors are linearly dependent, say,

$$b_1 v_1 + \cdots + b_j v_j + \cdots + b_m v_m = 0 \quad\text{where}\quad b_j \neq 0$$

Then

$$v_j = -b_j^{-1} b_1 v_1 - \cdots - b_j^{-1} b_{j-1} v_{j-1} - b_j^{-1} b_{j+1} v_{j+1} - \cdots - b_j^{-1} b_m v_m$$

and so $v_j$ is a linear combination of the other vectors.

We now make a slightly stronger statement than that above; this result has many im- 
portant consequences. 


Lemma 5.2: The nonzero vectors $v_1, \ldots, v_m$ are linearly dependent if and only if one of
them, say $v_i$, is a linear combination of the preceding vectors:

$$v_i = k_1 v_1 + k_2 v_2 + \cdots + k_{i-1} v_{i-1}$$




Remark 1. The set $\{v_1, \ldots, v_m\}$ is called a dependent or independent set according as the
vectors $v_1, \ldots, v_m$ are dependent or independent. We also define the empty
set $\emptyset$ to be independent.


Remark 2. If two of the vectors $v_1, \ldots, v_m$ are equal, say $v_1 = v_2$, then the vectors are
dependent. For

$$v_1 - v_2 + 0v_3 + \cdots + 0v_m = 0$$

and the coefficient of $v_1$ is not 0.


Remark 3. Two vectors v1 and v2 are dependent if and only if one of them is a multiple of 
the other. 


Remark 4. A set which contains a dependent subset is itself dependent. Hence any 
subset of an independent set is independent. 


Remark 5. If the set $\{v_1, \ldots, v_m\}$ is independent, then any rearrangement of the vectors
$\{v_{i_1}, v_{i_2}, \ldots, v_{i_m}\}$ is also independent.


Remark 6. In the real space $\mathbf{R}^3$, dependence of vectors can be described geometrically as
follows: any two vectors u and v are dependent if and only if they lie on the
same line through the origin; and any three vectors u, v and w are dependent
if and only if they lie on the same plane through the origin.

[Figure: left panel, u and v are dependent; right panel, u, v and w are dependent.]


BASIS AND DIMENSION 
We begin with a definition. 


Definition: A vector space V is said to be of finite dimension n or to be n-dimensional,
written dim V = n, if there exist linearly independent vectors $e_1, e_2, \ldots, e_n$
which span V. The sequence $\{e_1, e_2, \ldots, e_n\}$ is then called a basis of V.


The above definition of dimension is well defined in view of the following theorem. 


Theorem 5.3: Let V be a finite dimensional vector space. Then every basis of V has the 
same number of elements. 


The vector space {0} is defined to have dimension 0. (In a certain sense this agrees with
the above definition since, by definition, $\emptyset$ is independent and generates {0}.) When a
vector space is not of finite dimension, it is said to be of infinite dimension.


Example 5.3: Let K be any field. Consider the vector space $K^n$ which consists of n-tuples of
elements of K. The vectors

$$\begin{aligned}
e_1 &= (1, 0, 0, \ldots, 0, 0) \\
e_2 &= (0, 1, 0, \ldots, 0, 0) \\
&\ \ \vdots \\
e_n &= (0, 0, 0, \ldots, 0, 1)
\end{aligned}$$

form a basis, called the usual basis, of $K^n$. Thus $K^n$ has dimension n.




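Example 5.4: Let U be the vector space of all 2 x 3 matrices over a field K. Then the matrices

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

form a basis of U. Thus dim U = 6. More generally, let V be the vector space
of all m x n matrices over K and let $E_{ij} \in V$ be the matrix with ij-entry 1 and 0
elsewhere. Then the set $\{E_{ij}\}$ is a basis, called the usual basis, of V (Problem 5.32);
consequently dim V = mn.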


Example 5.5: Let W be the vector space of polynomials (in t) of degree $\le n$. The set $\{1, t, t^2, \ldots, t^n\}$
is linearly independent and generates W. Thus it is a basis of W and so
dim W = n + 1.


We comment that the vector space V of all polynomials is not finite dimensional 
since (Problem 4.26) no finite set of polynomials generates V. 


The above fundamental theorem on dimension is a consequence of the following im- 
portant “replacement lemma’’: 


Lemma 5.4: Suppose the set $\{v_1, v_2, \ldots, v_n\}$ generates a vector space V. If $\{w_1, \ldots, w_m\}$
is linearly independent, then $m \le n$ and V is generated by a set of the form

$$\{w_1, \ldots, w_m, v_{i_1}, \ldots, v_{i_{n-m}}\}$$


Thus, in particular, any n+1 or more vectors in V are linearly dependent. 


Observe in the above lemma that we have replaced m of the vectors in the generating 
set by the m independent vectors and still retained a generating set. 


Now suppose S is a subset of a vector space V. We call $\{v_1, \ldots, v_m\}$ a maximal
independent subset of S if:

(i) it is an independent subset of S; and

(ii) $\{v_1, \ldots, v_m, w\}$ is dependent for any $w \in S$.


The following theorem applies. 


Theorem 5.5: Suppose S generates V and $\{v_1, \ldots, v_m\}$ is a maximal independent subset
of S. Then $\{v_1, \ldots, v_m\}$ is a basis of V.

The main relationship between the dimension of a vector space and its independent
subsets is contained in the next theorem.


Theorem 5.6: Let V be of finite dimension n. Then: 
(i) Any set of n+1 or more vectors is linearly dependent. 
(ii) Any linearly independent set is part of a basis, i.e. can be extended to
a basis.
(iii) A linearly independent set with n elements is a basis. 


Example 5.6: The four vectors in $K^4$

$$(1, 1, 1, 1),\ (0, 1, 1, 1),\ (0, 0, 1, 1),\ (0, 0, 0, 1)$$

are linearly independent since they form a matrix in echelon form. Furthermore,
since dim $K^4$ = 4, they form a basis of $K^4$.


Example 5.7: The four vectors in $\mathbf{R}^3$,
(257, —132, 58), (43, 0, —17), (521, —317, 94), (328, —512, —731) 


must be linearly dependent since they come from a vector space of dimension 3. 




DIMENSION AND SUBSPACES 
The following theorems give basic relationships between the dimension of a vector space 
and the dimension of a subspace. 

Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then dim W $\le$ n.
In particular, if dim W = n, then W = V.


Example 5.8: Let W be a subspace of the real space $\mathbf{R}^3$. Now dim $\mathbf{R}^3$ = 3; hence by the
preceding theorem the dimension of W can only be 0, 1, 2 or 3. The following cases apply:

(i) dim W = 0, then W = {0}, a point;
(ii) dim W = 1, then W is a line through the origin;
(iii) dim W = 2, then W is a plane through the origin;
(iv) dim W = 3, then W is the entire space $\mathbf{R}^3$.


Theorem 5.8: Let U and W be finite-dimensional subspaces of a vector space V. Then 
U + W has finite dimension and 


$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W)$$


Note that if V is the direct sum of U and W, i.e. $V = U \oplus W$, then dim V =
dim U + dim W (Problem 5.48).


Example 5.9: Suppose U and W are the xy plane and yz plane, respectively, in $\mathbf{R}^3$: U = {(a, b, 0)},
W = {(0, b, c)}. Since $\mathbf{R}^3 = U + W$, dim(U + W) = 3. Also, dim U = 2 and
dim W = 2. By the above theorem,

$$3 = 2 + 2 - \dim(U \cap W) \quad\text{or}\quad \dim(U \cap W) = 1$$

Observe that this agrees with the fact that $U \cap W$ is the y axis, i.e. $U \cap W$ =
{(0, b, 0)}, and so has dimension 1.


RANK OF A MATRIX 

Let A be an arbitrary m x n matrix over a field K. Recall that the row space of A is
the subspace of $K^n$ generated by its rows, and the column space of A is the subspace of $K^m$
generated by its columns. The dimensions of the row space and of the column space of A
are called, respectively, the row rank and the column rank of A.


Theorem 5.9: The row rank and the column rank of the matrix A are equal. 


Definition: The rank of the matrix A, written rank (A), is the common value of its row 
rank and column rank. 


Thus the rank of a matrix gives the maximum number of independent rows, and also 
the maximum number of independent columns. We can obtain the rank of a matrix as 
follows. 

Suppose $A = \begin{pmatrix} 1 & 2 & 0 & -1 \\ 2 & 6 & -3 & -3 \\ 3 & 10 & -6 & -5 \end{pmatrix}$. We reduce A to echelon form using the elementary
row operations:

$$A \quad\text{to}\quad \begin{pmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 4 & -6 & -2 \end{pmatrix} \quad\text{to}\quad \begin{pmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Recall that row equivalent matrices have the same row space. Thus the nonzero rows of the
echelon matrix, which are independent by Theorem 5.1, form a basis of the row space of
A. Hence the rank of A is 2.
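
The reduction above can be sketched in code. This is a minimal illustration, assuming the matrix A has rows (1, 2, 0, -1), (2, 6, -3, -3), (3, 10, -6, -5); the function name `rank` is hypothetical, and exact rational arithmetic from the standard `fractions` module is used so that no rounding occurs.

```python
from fractions import Fraction


def rank(rows):
    """Reduce a copy of `rows` to echelon form and count the nonzero rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # index of the row that will receive the next pivot
    for c in range(ncols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c in every row below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


A = [[1, 2, 0, -1],
     [2, 6, -3, -3],
     [3, 10, -6, -5]]
A_transposed = [list(col) for col in zip(*A)]

print(rank(A), rank(A_transposed))  # 2 2
```

Applying the same function to the transpose gives the column rank, which agrees with the row rank as Theorem 5.9 asserts.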


APPLICATIONS TO LINEAR EQUATIONS 


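Consider a system of m linear equations in n unknowns $x_1, \ldots, x_n$ over a field K:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

or the equivalent matrix equation

$$AX = B$$

where $A = (a_{ij})$ is the coefficient matrix, and $X = (x_i)$ and $B = (b_i)$ are the column vectors
consisting of the unknowns and of the constants, respectively. Recall that the augmented
matrix of the system is defined to be the matrix

$$(A, B) = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{pmatrix}$$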


Remark 1. The above linear equations are said to be dependent or independent according 
as the corresponding vectors, i.e. the rows of the augmented matrix, are 
dependent or independent. 


Remark 2. Two systems of linear equations are equivalent if and only if the corresponding 
augmented matrices are row equivalent, i.e. have the same row space. 


Remark 3. We can always replace a system of equations by a system of independent 
equations, such as a system in echelon form. The number of independent 
equations will always be equal to the rank of the augmented matrix. 


Observe that the above system is also equivalent to the vector equation 


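$$x_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}
+ x_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix}
+ \cdots
+ x_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$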


Thus the system AX =B has a solution if and only if the column vector B is a linear 
combination of the columns of the matrix A, i.e. belongs to the column space of A. This 
gives us the following basic existence theorem. 


Theorem 5.10: The system of linear equations AX =B has a solution if and only if the 
coefficient matrix A and the augmented matrix (A, B) have the same rank. 
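
Theorem 5.10 translates directly into a consistency test. The sketch below is illustrative only: the names `rank` and `is_consistent` and the two small systems are assumptions made for the example, and the rank is computed by row reduction over exact fractions.

```python
from fractions import Fraction


def rank(rows):
    """Number of nonzero rows after reducing `rows` to echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_consistent(A, B):
    """AX = B has a solution iff rank(A) equals rank of the augmented matrix."""
    augmented = [list(row) + [b] for row, b in zip(A, B)]
    return rank(A) == rank(augmented)


A = [[1, 1],
     [2, 2]]
print(is_consistent(A, [2, 4]))  # True: (2, 4) lies in the column space of A
print(is_consistent(A, [2, 5]))  # False: the augmented matrix has rank 2
```

In the second system the column B is not a linear combination of the columns of A, so adjoining it raises the rank from 1 to 2.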




Recall (Theorem 2.1) that if the system AX = B does have a solution, say v, then its
general solution is of the form $v + W = \{v + w : w \in W\}$ where W is the general solution
of the associated homogeneous system AX = 0. Now W is a subspace of $K^n$ and so has a
dimension. The next theorem, whose proof is postponed until the next chapter (page 127),
applies.


Theorem 5.11: The dimension of the solution space W of the homogeneous system of linear 
equations AX =0 is n—r where n is the number of unknowns and r is 
the rank of the coefficient matrix A. 


In case the system AX = 0 is in echelon form, then it has precisely n - r free variables
(see page 21), say, $x_{i_1}, x_{i_2}, \ldots, x_{i_{n-r}}$. Let $v_j$ be the solution obtained by setting $x_{i_j} = 1$,
and all other free variables = 0. Then the solutions $v_1, \ldots, v_{n-r}$ are linearly independent
(Problem 5.43) and so form a basis for the solution space.


Example 5.10: Find the dimension and a basis of the solution space W of the system of linear
equations

$$\begin{aligned}
x + 2y - 4z + 3r - s &= 0 \\
x + 2y - 2z + 2r + s &= 0 \\
2x + 4y - 2z + 3r + 4s &= 0
\end{aligned}$$

Reduce the system to echelon form:

$$\begin{aligned}
x + 2y - 4z + 3r - s &= 0 \\
2z - r + 2s &= 0 \\
6z - 3r + 6s &= 0
\end{aligned}
\quad\text{and then}\quad
\begin{aligned}
x + 2y - 4z + 3r - s &= 0 \\
2z - r + 2s &= 0
\end{aligned}$$

There are 5 unknowns and 2 (nonzero) equations in echelon form; hence dim W =
5 - 2 = 3. Note that the free variables are y, r and s. Set:

(i) y = 1, r = 0, s = 0, (ii) y = 0, r = 1, s = 0, (iii) y = 0, r = 0, s = 1

to obtain the following respective solutions:

$$v_1 = (-2, 1, 0, 0, 0), \quad v_2 = (-1, 0, \tfrac{1}{2}, 1, 0), \quad v_3 = (-3, 0, -1, 0, 1)$$

The set $\{v_1, v_2, v_3\}$ is a basis of the solution space W.
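
The basis found in Example 5.10 can be verified numerically. This sketch assumes the system reads x + 2y - 4z + 3r - s = 0, x + 2y - 2z + 2r + s = 0, 2x + 4y - 2z + 3r + 4s = 0, with unknowns ordered (x, y, z, r, s); the names `system`, `basis` and `residuals` are hypothetical.

```python
from fractions import Fraction as F

# Coefficient rows of the homogeneous system, unknowns ordered (x, y, z, r, s).
system = [
    (1, 2, -4, 3, -1),
    (1, 2, -2, 2, 1),
    (2, 4, -2, 3, 4),
]

# Candidate basis vectors obtained from the free variables y, r, s.
basis = [
    (F(-2), F(1), F(0), F(0), F(0)),
    (F(-1), F(0), F(1, 2), F(1), F(0)),
    (F(-3), F(0), F(-1), F(0), F(1)),
]


def residuals(solution):
    """Left-hand side of every equation evaluated at `solution`."""
    return [sum(a * x for a, x in zip(eq, solution)) for eq in system]


# Each candidate must make all three left-hand sides vanish.
all_solve = all(r == 0 for vec in basis for r in residuals(vec))

# The (y, r, s) components form the identity pattern, which forces independence.
free_parts = [(vec[1], vec[3], vec[4]) for vec in basis]

print(all_solve)
print(free_parts == [(1, 0, 0), (0, 1, 0), (0, 0, 1)])
```

Because the free-variable components of the three solutions form the rows of an identity matrix, no nontrivial combination of them can vanish, which is the independence argument behind Problem 5.43.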


COORDINATES 
Let $\{e_1, \ldots, e_n\}$ be a basis of an n-dimensional vector space V over a field K, and let v
be any vector in V. Since $\{e_i\}$ generates V, v is a linear combination of the $e_i$:

$$v = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n, \quad a_i \in K$$

Since the $e_i$ are independent, such a representation is unique (Problem 5.7), i.e. the n
scalars $a_1, \ldots, a_n$ are completely determined by the vector v and the basis $\{e_i\}$. We call
these scalars the coordinates of v in $\{e_i\}$, and we call the n-tuple $(a_1, \ldots, a_n)$ the coordinate
vector of v relative to $\{e_i\}$ and denote it by $[v]_e$ or simply [v]:

$$[v]_e = (a_1, a_2, \ldots, a_n)$$


Example 5.11: Let V be the vector space of polynomials with degree $\le 2$:

$$V = \{at^2 + bt + c : a, b, c \in \mathbf{R}\}$$

The polynomials

$$e_1 = 1, \quad e_2 = t - 1 \quad\text{and}\quad e_3 = (t - 1)^2 = t^2 - 2t + 1$$

form a basis for V. Let $v = 2t^2 - 5t + 6$. Find $[v]_e$, the coordinate vector of v
relative to the basis $\{e_1, e_2, e_3\}$.

Set v as a linear combination of the $e_i$ using the unknowns x, y and z: $v = xe_1 + ye_2 + ze_3$.

$$\begin{aligned}
2t^2 - 5t + 6 &= x(1) + y(t - 1) + z(t^2 - 2t + 1) \\
&= x + yt - y + zt^2 - 2zt + z \\
&= zt^2 + (y - 2z)t + (x - y + z)
\end{aligned}$$

Then set the coefficients of the same powers of t equal to each other:

$$\begin{aligned}
x - y + z &= 6 \\
y - 2z &= -5 \\
z &= 2
\end{aligned}$$

The solution of the above system is x = 3, y = -1, z = 2. Thus

$$v = 3e_1 - e_2 + 2e_3, \quad\text{and so}\quad [v]_e = (3, -1, 2)$$


Example 5.12: Consider the real space $\mathbf{R}^3$. Find the coordinate vector of v = (3, 1, -4) relative to
the basis $f_1 = (1, 1, 1)$, $f_2 = (0, 1, 1)$, $f_3 = (0, 0, 1)$.

Set v as a linear combination of the $f_i$ using the unknowns x, y and z: $v = xf_1 + yf_2 + zf_3$.

$$\begin{aligned}
(3, 1, -4) &= x(1, 1, 1) + y(0, 1, 1) + z(0, 0, 1) \\
&= (x, x, x) + (0, y, y) + (0, 0, z) \\
&= (x,\ x + y,\ x + y + z)
\end{aligned}$$

Then set the corresponding components equal to each other to obtain the equivalent
system of equations

$$\begin{aligned}
x &= 3 \\
x + y &= 1 \\
x + y + z &= -4
\end{aligned}$$

having solution x = 3, y = -2, z = -5. Thus $[v]_f = (3, -2, -5)$.

We remark that relative to the usual basis $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3$ =
(0, 0, 1), the coordinate vector of v is identical to v itself: $[v]_e = (3, 1, -4) = v$.
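
The triangular system of Example 5.12 can be solved by forward substitution. The following is a minimal sketch, assuming v = (3, 1, -4) and the basis f1 = (1, 1, 1), f2 = (0, 1, 1), f3 = (0, 0, 1); the function name `coordinates` is hypothetical, and the method relies on the assumption that the j-th basis vector has zeros in its first j components, which holds here but not for an arbitrary basis.

```python
from fractions import Fraction


def coordinates(basis, v):
    """Solve x_1 f_1 + ... + x_n f_n = v by forward substitution.

    Assumes basis[j][i] == 0 whenever j > i and basis[i][i] != 0,
    so component i of the sum involves only x_1, ..., x_i.
    """
    n = len(v)
    x = [Fraction(0)] * n
    for i in range(n):
        already_known = sum(x[j] * basis[j][i] for j in range(i))
        x[i] = (Fraction(v[i]) - already_known) / basis[i][i]
    return x


f = [(1, 1, 1), (0, 1, 1), (0, 0, 1)]
v = (3, 1, -4)
coords = coordinates(f, v)
print([str(c) for c in coords])  # ['3', '-2', '-5']

# Relative to the usual basis the coordinate vector is v itself.
usual = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(coordinates(usual, v) == list(v))  # True
```

For a basis without this triangular shape, the same coordinates would be found by reducing the full system to echelon form first.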


We have shown above that to each vector $v \in V$ there corresponds, relative to a given
basis $\{e_1, \ldots, e_n\}$, an n-tuple $[v]_e$ in $K^n$. On the other hand, if $(a_1, \ldots, a_n) \in K^n$, then there
exists a vector in V of the form $a_1 e_1 + \cdots + a_n e_n$. Thus the basis $\{e_i\}$ determines a one-to-one
correspondence between the vectors in V and the n-tuples in $K^n$. Observe also that if

$v = a_1 e_1 + \cdots + a_n e_n$ corresponds to $(a_1, \ldots, a_n)$

and $w = b_1 e_1 + \cdots + b_n e_n$ corresponds to $(b_1, \ldots, b_n)$

then

$v + w = (a_1 + b_1)e_1 + \cdots + (a_n + b_n)e_n$ corresponds to $(a_1, \ldots, a_n) + (b_1, \ldots, b_n)$

and, for any scalar $k \in K$,

$kv = (ka_1)e_1 + \cdots + (ka_n)e_n$ corresponds to $k(a_1, \ldots, a_n)$

That is, $[v + w]_e = [v]_e + [w]_e$ and $[kv]_e = k[v]_e$

Thus the above one-to-one correspondence between V and $K^n$ preserves the vector space
operations of vector addition and scalar multiplication; we then say that V and $K^n$ are
isomorphic, written $V \cong K^n$. We state this result formally.


Theorem 5.12: Let V be an n-dimensional vector space over a field K. Then V and $K^n$ are
isomorphic.




The next example gives a practical application of the above result. 


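Example 5.13: Determine whether the following matrices are dependent or independent:

$$A = \begin{pmatrix} 1 & 2 & -3 \\ 4 & 0 & 1 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 & 3 & -4 \\ 6 & 5 & 4 \end{pmatrix}, \quad
C = \begin{pmatrix} 3 & 8 & -11 \\ 16 & 10 & 9 \end{pmatrix}$$

The coordinate vectors of the above matrices relative to the basis in Example
5.4, page 89, are

$$[A] = (1, 2, -3, 4, 0, 1), \quad [B] = (1, 3, -4, 6, 5, 4), \quad [C] = (3, 8, -11, 16, 10, 9)$$

Form the matrix M whose rows are the above coordinate vectors:

$$M = \begin{pmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 1 & 3 & -4 & 6 & 5 & 4 \\ 3 & 8 & -11 & 16 & 10 & 9 \end{pmatrix}
\ \text{to}\
\begin{pmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 0 & 1 & -1 & 2 & 5 & 3 \\ 0 & 2 & -2 & 4 & 10 & 6 \end{pmatrix}
\ \text{to}\
\begin{pmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 0 & 1 & -1 & 2 & 5 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

Since the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B]
and [C] generate a space of dimension 2 and so are dependent. Accordingly, the
original matrices A, B and C are dependent.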
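
The passage from matrices to coordinate vectors is easy to mimic in code. This sketch assumes the entries A = (1 2 -3; 4 0 1), B = (1 3 -4; 6 5 4), C = (3 8 -11; 16 10 9); the helper name `coordinate_vector` is hypothetical, and the explicit relation A + 2B - C = 0 is read off from the row reduction rather than stated in the text.

```python
A = [[1, 2, -3], [4, 0, 1]]
B = [[1, 3, -4], [6, 5, 4]]
C = [[3, 8, -11], [16, 10, 9]]


def coordinate_vector(matrix):
    """Coordinates relative to the usual basis {E_ij}: entries read row by row."""
    return [entry for row in matrix for entry in row]


a, b, c = (coordinate_vector(m) for m in (A, B, C))
print(a)  # [1, 2, -3, 4, 0, 1]

# A nontrivial relation among the coordinate vectors: [A] + 2[B] - [C] = 0.
relation = [x + 2 * y - z for x, y, z in zip(a, b, c)]
print(relation)  # [0, 0, 0, 0, 0, 0]
```

Since the correspondence with $K^6$ preserves addition and scalar multiplication, the same relation A + 2B - C = 0 holds for the matrices themselves.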


Solved Problems 


LINEAR DEPENDENCE 


5.1. Determine whether or not u and v are linearly dependent if:
(i) u = (3, 4), v = (1, -3)        (iii) u = (4, 3, -2), v = (2, 6, 7)
(ii) u = (2, -3), v = (6, -9)      (iv) u = (-4, 6, -2), v = (2, -3, 1)

ee Bd pte ee fio kof BORER es ee 
eee e ek a a ae < 5 oo a 


(vii) $u = 2 - 5t + 6t^2 - t^3$, $v = 3 + 2t - 4t^2 + 5t^3$

(viii) $u = 1 - 3t + 2t^2 - 3t^3$, $v = -3 + 9t - 6t^2 + 9t^3$


Two vectors u and v are dependent if and only if one is a multiple of the other.


(i) No. (ii) Yes; for v = 3u. (iii) No. (iv) Yes; for u = -2v. (v) Yes; for v = 2u. (vi) No.
(vii) No. (viii) Yes; for v = -3u.
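
The criterion "one is a multiple of the other" can be tested without division. The sketch below uses an equivalent condition that is not stated in the text: two vectors are dependent exactly when every 2 x 2 minor $u_i v_j - u_j v_i$ vanishes. The function name `dependent_pair` is hypothetical, and the polynomials of part (viii) are assumed to be represented by their coefficient tuples.

```python
def dependent_pair(u, v):
    """True iff u and v are linearly dependent (all 2x2 minors vanish)."""
    n = len(u)
    return all(
        u[i] * v[j] == u[j] * v[i]
        for i in range(n)
        for j in range(i + 1, n)
    )


results = {
    "i": dependent_pair((3, 4), (1, -3)),
    "ii": dependent_pair((2, -3), (6, -9)),
    "iv": dependent_pair((-4, 6, -2), (2, -3, 1)),
    # (viii): coefficients of 1, t, t^2, t^3.
    "viii": dependent_pair((1, -3, 2, -3), (-3, 9, -6, 9)),
}
print(results)
```

The minor test also handles the case where one of the vectors is zero, in which the pair is dependent even though the zero vector is the only one that is a multiple of the other.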


5.2. Determine whether or not the following vectors in $\mathbf{R}^3$ are linearly dependent:
(i) (1, -2, 1), (2, 1, -1), (7, -4, 1)                     (iii) (1, 2, -3), (1, -3, 2), (2, -1, 5)
(ii) (1, -3, 7), (2, 0, -6), (3, -1, -1), (2, 4, -5)       (iv) (2, -3, 7), (0, 0, 0), (3, -1, -4)


(i) Method 1. Set a linear combination of the vectors equal to the zero vector using unknown
scalars x, y and z:

$$x(1, -2, 1) + y(2, 1, -1) + z(7, -4, 1) = (0, 0, 0)$$

Then $(x, -2x, x) + (2y, y, -y) + (7z, -4z, z) = (0, 0, 0)$
or $(x + 2y + 7z,\ -2x + y - 4z,\ x - y + z) = (0, 0, 0)$

Set corresponding components equal to each other to obtain the equivalent homogeneous system,
and reduce to echelon form:

$$\begin{aligned}
x + 2y + 7z &= 0 \\
-2x + y - 4z &= 0 \\
x - y + z &= 0
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 2y + 7z &= 0 \\
5y + 10z &= 0 \\
-3y - 6z &= 0
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 2y + 7z &= 0 \\
y + 2z &= 0
\end{aligned}$$

The system, in echelon form, has only two nonzero equations in the three unknowns; hence the
system has a nonzero solution. Thus the original vectors are linearly dependent.

Method 2. Form the matrix whose rows are the given vectors, and reduce to echelon form using
the elementary row operations:

$$\begin{pmatrix} 1 & -2 & 1 \\ 2 & 1 & -1 \\ 7 & -4 & 1 \end{pmatrix}
\ \text{to}\
\begin{pmatrix} 1 & -2 & 1 \\ 0 & 5 & -3 \\ 0 & 10 & -6 \end{pmatrix}
\ \text{to}\
\begin{pmatrix} 1 & -2 & 1 \\ 0 & 5 & -3 \\ 0 & 0 & 0 \end{pmatrix}$$

Since the echelon matrix has a zero row, the vectors are dependent. (The three given vectors
generate a space of dimension 2.)

(ii) Yes, since any four (or more) vectors in $\mathbf{R}^3$ are dependent.

(iii) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:

$$\begin{pmatrix} 1 & 2 & -3 \\ 1 & -3 & 2 \\ 2 & -1 & 5 \end{pmatrix}
\ \text{to}\
\begin{pmatrix} 1 & 2 & -3 \\ 0 & -5 & 5 \\ 0 & -5 & 11 \end{pmatrix}
\ \text{to}\
\begin{pmatrix} 1 & 2 & -3 \\ 0 & -5 & 5 \\ 0 & 0 & 6 \end{pmatrix}$$

Since the echelon matrix has no zero rows, the vectors are independent. (The three given vectors
generate a space of dimension 3.)

(iv) Since 0 = (0, 0, 0) is one of the vectors, the vectors are dependent.


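5.3. Let V be the vector space of 2 x 2 matrices over R. Determine whether the matrices
$A, B, C \in V$ are dependent where:

(i) $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $C = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$

(ii) $A = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 3 & -1 \\ 2 & 2 \end{pmatrix}$, $C = \begin{pmatrix} 1 & -5 \\ -4 & 0 \end{pmatrix}$


(i) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown
scalars x, y and z; that is, set xA + yB + zC = 0. Thus:

$$x \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + y \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + z \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

or

$$\begin{pmatrix} x + y + z & x + z \\ x & x + y \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

Set corresponding entries equal to each other to obtain the equivalent homogeneous system of
equations:

$$\begin{aligned}
x + y + z &= 0 \\
x + z &= 0 \\
x &= 0 \\
x + y &= 0
\end{aligned}$$

Solving the above system we obtain only the zero solution, x = 0, y = 0, z = 0. We have
shown that xA + yB + zC = 0 implies x = 0, y = 0, z = 0; hence the matrices A, B and C are
linearly independent.

(ii) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown
scalars x, y and z; that is, set xA + yB + zC = 0. Thus:

$$x \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix} + y \begin{pmatrix} 3 & -1 \\ 2 & 2 \end{pmatrix} + z \begin{pmatrix} 1 & -5 \\ -4 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

or

$$\begin{pmatrix} x + 3y + z & 2x - y - 5z \\ 3x + 2y - 4z & x + 2y \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

Set corresponding entries equal to each other to obtain the equivalent homogeneous system of
linear equations and reduce to echelon form:

$$\begin{aligned}
x + 3y + z &= 0 \\
2x - y - 5z &= 0 \\
3x + 2y - 4z &= 0 \\
x + 2y &= 0
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 3y + z &= 0 \\
-7y - 7z &= 0 \\
-7y - 7z &= 0 \\
-y - z &= 0
\end{aligned}
\quad\text{or finally}\quad
\begin{aligned}
x + 3y + z &= 0 \\
y + z &= 0
\end{aligned}$$

The system in echelon form has a free variable and hence a nonzero solution, for example,
x = 2, y = -1, z = 1. We have shown that xA + yB + zC = 0 does not imply that x = 0,
y = 0, z = 0; hence the matrices are linearly dependent.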


5.4. Let V be the vector space of polynomials of degree $\le 3$ over R. Determine whether
$u, v, w \in V$ are independent or dependent where:

(i) $u = t^3 - 3t^2 + 5t + 1$, $v = t^3 - t^2 + 8t + 2$, $w = 2t^3 - 4t^2 + 9t + 5$

(ii) $u = t^3 + 4t^2 - 2t + 3$, $v = t^3 + 6t^2 - t + 4$, $w = 3t^3 + 8t^2 - 8t + 7$


(i) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using
unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:

$$x(t^3 - 3t^2 + 5t + 1) + y(t^3 - t^2 + 8t + 2) + z(2t^3 - 4t^2 + 9t + 5) = 0$$

or $xt^3 - 3xt^2 + 5xt + x + yt^3 - yt^2 + 8yt + 2y + 2zt^3 - 4zt^2 + 9zt + 5z = 0$

or $(x + y + 2z)t^3 + (-3x - y - 4z)t^2 + (5x + 8y + 9z)t + (x + 2y + 5z) = 0$

The coefficients of the powers of t must each be 0:

$$\begin{aligned}
x + y + 2z &= 0 \\
-3x - y - 4z &= 0 \\
5x + 8y + 9z &= 0 \\
x + 2y + 5z &= 0
\end{aligned}$$

Solving the above homogeneous system, we obtain only the zero solution: x = 0, y = 0, z = 0;
hence u, v and w are independent.

(ii) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using
unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:

$$x(t^3 + 4t^2 - 2t + 3) + y(t^3 + 6t^2 - t + 4) + z(3t^3 + 8t^2 - 8t + 7) = 0$$

or $xt^3 + 4xt^2 - 2xt + 3x + yt^3 + 6yt^2 - yt + 4y + 3zt^3 + 8zt^2 - 8zt + 7z = 0$

or $(x + y + 3z)t^3 + (4x + 6y + 8z)t^2 + (-2x - y - 8z)t + (3x + 4y + 7z) = 0$

Set the coefficients of the powers of t each equal to 0 and reduce the system to echelon form:

$$\begin{aligned}
x + y + 3z &= 0 \\
4x + 6y + 8z &= 0 \\
-2x - y - 8z &= 0 \\
3x + 4y + 7z &= 0
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + y + 3z &= 0 \\
2y - 4z &= 0 \\
y - 2z &= 0 \\
y - 2z &= 0
\end{aligned}
\quad\text{or finally}\quad
\begin{aligned}
x + y + 3z &= 0 \\
y - 2z &= 0
\end{aligned}$$

The system in echelon form has a free variable and hence a nonzero solution. We have shown
that xu + yv + zw = 0 does not imply that x = 0, y = 0, z = 0; hence the polynomials are
linearly dependent.


5.5. Let V be the vector space of functions from R into R. Show that $f, g, h \in V$ are
independent where: (i) $f(t) = e^{2t}$, $g(t) = t^2$, $h(t) = t$; (ii) $f(t) = \sin t$, $g(t) = \cos t$,
$h(t) = t$.


In each case set a linear combination of the functions equal to the zero function 0 using unknown
scalars x, y and z: xf + yg + zh = 0; and then show that x = 0, y = 0, z = 0. We emphasize that
xf + yg + zh = 0 means that, for every value of t, xf(t) + yg(t) + zh(t) = 0.

(i) In the equation $xe^{2t} + yt^2 + zt = 0$, substitute

t = 0 to obtain $xe^0 + y \cdot 0 + z \cdot 0 = 0$ or x = 0
t = 1 to obtain $xe^2 + y + z = 0$
t = 2 to obtain $xe^4 + 4y + 2z = 0$

Solve the system

$$\begin{aligned}
x &= 0 \\
xe^2 + y + z &= 0 \\
xe^4 + 4y + 2z &= 0
\end{aligned}$$

to obtain only the zero solution: x = 0, y = 0, z = 0. Hence f, g and h are independent.

(ii) Method 1. In the equation $x \sin t + y \cos t + zt = 0$, substitute

t = 0 to obtain $x \cdot 0 + y \cdot 1 + z \cdot 0 = 0$ or y = 0
t = $\pi/2$ to obtain $x \cdot 1 + y \cdot 0 + z\pi/2 = 0$ or $x + \pi z/2 = 0$
t = $\pi$ to obtain $x \cdot 0 + y(-1) + z \cdot \pi = 0$ or $-y + \pi z = 0$

Solve the system

$$\begin{aligned}
y &= 0 \\
x + \pi z/2 &= 0 \\
-y + \pi z &= 0
\end{aligned}$$

to obtain only the zero solution: x = 0, y = 0, z = 0. Hence f, g and h are independent.

Method 2. Take the first, second and third derivatives of $x \sin t + y \cos t + zt = 0$ with
respect to t to get

$$\begin{aligned}
x \cos t - y \sin t + z &= 0 \qquad (1) \\
-x \sin t - y \cos t &= 0 \qquad (2) \\
-x \cos t + y \sin t &= 0 \qquad (3)
\end{aligned}$$

Add (1) and (3) to obtain z = 0. Multiply (2) by sin t and (3) by cos t, and then add:

$$\begin{aligned}
\sin t \times (2):\quad -x \sin^2 t - y \sin t \cos t &= 0 \\
\cos t \times (3):\quad -x \cos^2 t + y \sin t \cos t &= 0 \\
-x(\sin^2 t + \cos^2 t) &= 0 \quad\text{or}\quad x = 0
\end{aligned}$$

Lastly, multiply (2) by $-\cos t$ and (3) by sin t; and then add to obtain

$$y(\cos^2 t + \sin^2 t) = 0 \quad\text{or}\quad y = 0$$

Since $x \sin t + y \cos t + zt = 0$ implies x = 0, y = 0, z = 0,
f, g and h are independent.


5.6. Let u, v and w be independent vectors. Show that u + v, u - v and u - 2v + w are
also independent.


Suppose x(u + v) + y(u - v) + z(u - 2v + w) = 0 where x, y and z are scalars. Then
xu + xv + yu - yv + zu - 2zv + zw = 0 or

$$(x + y + z)u + (x - y - 2z)v + zw = 0$$

But u, v and w are linearly independent; hence the coefficients in the above relation are each 0:

$$\begin{aligned}
x + y + z &= 0 \\
x - y - 2z &= 0 \\
z &= 0
\end{aligned}$$

The only solution to the above system is x = 0, y = 0, z = 0. Hence u + v, u - v and u - 2v + w are
independent.


5.7. Let $v_1, v_2, \ldots, v_m$ be independent vectors, and suppose u is a linear combination of
the $v_i$, say $u = a_1 v_1 + a_2 v_2 + \cdots + a_m v_m$ where the $a_i$ are scalars. Show that the
above representation of u is unique.


Suppose $u = b_1 v_1 + b_2 v_2 + \cdots + b_m v_m$ where the $b_i$ are scalars. Subtracting,

$$0 = u - u = (a_1 - b_1)v_1 + (a_2 - b_2)v_2 + \cdots + (a_m - b_m)v_m$$

But the $v_i$ are linearly independent; hence the coefficients in the above relation are each 0:

$$a_1 - b_1 = 0,\ a_2 - b_2 = 0,\ \ldots,\ a_m - b_m = 0$$

Hence $a_1 = b_1, a_2 = b_2, \ldots, a_m = b_m$, and so the above representation of u as a linear combination
of the $v_i$ is unique.


5.8. Show that the vectors v = (1 + i, 2i) and w = (1, 1 + i) in $\mathbf{C}^2$ are linearly dependent
over the complex field C but are linearly independent over the real field R.


Recall that 2 vectors are dependent iff one is a multiple of the other. Since the first coordinate
of w is 1, v can be a multiple of w iff v = (1 + i)w. But $1 + i \notin \mathbf{R}$; hence v and w are independent
over R. Since

$$(1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = v$$

and $1 + i \in \mathbf{C}$, they are dependent over C.
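
The dependence over C in Problem 5.8 can be checked with Python's built-in complex numbers. This is a minimal sketch, assuming v = (1 + i, 2i) and w = (1, 1 + i); the variable names are hypothetical.

```python
# Python writes the imaginary unit i as 1j.
v = (1 + 1j, 2j)
w = (1, 1 + 1j)

# The first coordinate of w is 1, so the only candidate multiplier is v[0].
k = v[0] / w[0]

# v = k * w holds componentwise, so v and w are dependent over C.
is_multiple = all(k * wi == vi for wi, vi in zip(w, v))

# k has a nonzero imaginary part, so no real scalar can play its role.
k_is_real = (k.imag == 0)

print(k, is_multiple, k_is_real)
```

The same pair of vectors is therefore dependent or independent according to which field supplies the scalars, which is why the definition always names the field K.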


5.9. Suppose $S = \{v_1, \ldots, v_m\}$ contains a dependent subset, say $\{v_1, \ldots, v_r\}$. Show that
S is also dependent. Hence every subset of an independent set is independent.


Since $\{v_1, \ldots, v_r\}$ is dependent, there exist scalars $a_1, \ldots, a_r$, not all 0, such that

$$a_1 v_1 + a_2 v_2 + \cdots + a_r v_r = 0$$

Hence there exist scalars $a_1, \ldots, a_r, 0, \ldots, 0$, not all 0, such that

$$a_1 v_1 + \cdots + a_r v_r + 0v_{r+1} + \cdots + 0v_m = 0$$

Accordingly, S is dependent.


5.10. Suppose $\{v_1, \ldots, v_m\}$ is independent, but $\{v_1, \ldots, v_m, w\}$ is dependent. Show that w
is a linear combination of the $v_i$.


Method 1. Since $\{v_1, \ldots, v_m, w\}$ is dependent, there exist scalars $a_1, \ldots, a_m, b$, not all 0, such that
$a_1 v_1 + \cdots + a_m v_m + bw = 0$. If b = 0, then one of the $a_i$ is not zero and $a_1 v_1 + \cdots + a_m v_m = 0$.
But this contradicts the hypothesis that $\{v_1, \ldots, v_m\}$ is independent. Accordingly, $b \neq 0$ and so

$$w = b^{-1}(-a_1 v_1 - \cdots - a_m v_m) = -b^{-1} a_1 v_1 - \cdots - b^{-1} a_m v_m$$

That is, w is a linear combination of the $v_i$.


Method 2. If w = 0, then $w = 0v_1 + \cdots + 0v_m$. On the other hand, if $w \neq 0$ then, by Lemma
5.2, one of the vectors in $\{v_1, \ldots, v_m, w\}$ is a linear combination of the preceding vectors. This
vector cannot be one of the v's since $\{v_1, \ldots, v_m\}$ is independent. Hence w is a linear combination
of the $v_i$.


PROOFS OF THEOREMS 


5.11. Prove Lemma 5.2: The nonzero vectors $v_1, \ldots, v_m$ are linearly dependent if and only
if one of them, say $v_i$, is a linear combination of the preceding vectors: $v_i$ =
$a_1 v_1 + \cdots + a_{i-1} v_{i-1}$.

Suppose $v_i = a_1 v_1 + \cdots + a_{i-1} v_{i-1}$. Then

$$a_1 v_1 + \cdots + a_{i-1} v_{i-1} - v_i + 0v_{i+1} + \cdots + 0v_m = 0$$

and the coefficient of $v_i$ is not 0. Hence the $v_i$ are linearly dependent.


Conversely, suppose the $v_i$ are linearly dependent. Then there exist scalars $a_1, \ldots, a_m$, not all
0, such that $a_1 v_1 + \cdots + a_m v_m = 0$. Let k be the largest integer such that $a_k \neq 0$. Then

$$a_1 v_1 + \cdots + a_k v_k + 0v_{k+1} + \cdots + 0v_m = 0 \quad\text{or}\quad a_1 v_1 + \cdots + a_k v_k = 0$$

Suppose k = 1; then $a_1 v_1 = 0$, $a_1 \neq 0$ and so $v_1 = 0$. But the $v_i$ are nonzero vectors; hence
k > 1 and

$$v_k = -a_k^{-1} a_1 v_1 - \cdots - a_k^{-1} a_{k-1} v_{k-1}$$

That is, $v_k$ is a linear combination of the preceding vectors.


5.12. Prove Theorem 5.1: The nonzero rows $R_1, \ldots, R_n$ of a matrix in echelon form are
linearly independent.

Suppose $\{R_n, R_{n-1}, \ldots, R_1\}$ is dependent. Then one of the rows, say $R_m$, is a linear combination
of the preceding rows:

$$R_m = a_{m+1} R_{m+1} + a_{m+2} R_{m+2} + \cdots + a_n R_n \qquad (*)$$

Now suppose the kth component of $R_m$ is its first nonzero entry. Then, since the matrix is in echelon
form, the kth components of $R_{m+1}, \ldots, R_n$ are all 0, and so the kth component of (*) is
$a_{m+1} \cdot 0 + a_{m+2} \cdot 0 + \cdots + a_n \cdot 0 = 0$. But this contradicts the assumption that the kth component of $R_m$ is
not 0. Thus $R_1, \ldots, R_n$ are independent.


5.13. Suppose $\{v_1, \ldots, v_m\}$ generates a vector space V. Prove:

(i) If $w \in V$, then $\{w, v_1, \ldots, v_m\}$ is linearly dependent and generates V.

(ii) If $v_i$ is a linear combination of the preceding vectors, then $\{v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_m\}$
generates V.


(i) If $w \in V$, then w is a linear combination of the $v_i$ since $\{v_i\}$ generates V. Accordingly,
$\{w, v_1, \ldots, v_m\}$ is linearly dependent. Clearly, w with the $v_i$ generate V since the $v_i$ by
themselves generate V. That is, $\{w, v_1, \ldots, v_m\}$ generates V.

(ii) Suppose $v_i = k_1 v_1 + \cdots + k_{i-1} v_{i-1}$. Let $u \in V$. Since $\{v_i\}$ generates V, u is a linear
combination of the $v_i$, say, $u = a_1 v_1 + \cdots + a_m v_m$. Substituting for $v_i$, we obtain

$$\begin{aligned}
u &= a_1 v_1 + \cdots + a_{i-1} v_{i-1} + a_i(k_1 v_1 + \cdots + k_{i-1} v_{i-1}) + a_{i+1} v_{i+1} + \cdots + a_m v_m \\
&= (a_1 + a_i k_1)v_1 + \cdots + (a_{i-1} + a_i k_{i-1})v_{i-1} + a_{i+1} v_{i+1} + \cdots + a_m v_m
\end{aligned}$$

Thus $\{v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_m\}$ generates V. In other words, we can delete $v_i$ from the
generating set and still retain a generating set.


5.14. Prove Lemma 5.4: Suppose $\{v_1, \ldots, v_n\}$ generates a vector space V. If $\{w_1, \ldots, w_m\}$
is linearly independent, then $m \le n$ and V is generated by a set of the form
$\{w_1, \ldots, w_m, v_{i_1}, \ldots, v_{i_{n-m}}\}$. Thus, in particular, any n + 1 or more vectors in V are
linearly dependent.


It suffices to prove the theorem in the case that the $v_i$ are all not 0. (Prove!) Since the $\{v_i\}$
generates V, we have by the preceding problem that

$$\{w_1, v_1, \ldots, v_n\} \qquad (1)$$

is linearly dependent and also generates V. By Lemma 5.2, one of the vectors in (1) is a linear
combination of the preceding vectors. This vector cannot be $w_1$, so it must be one of the v's, say $v_j$.
Thus by the preceding problem we can delete $v_j$ from the generating set (1) and obtain the generating
set

$$\{w_1, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n\} \qquad (2)$$

Now we repeat the argument with the vector $w_2$. That is, since (2) generates V, the set

$$\{w_1, w_2, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n\} \qquad (3)$$

is linearly dependent and also generates V. Again by Lemma 5.2, one of the vectors in (3) is a linear
combination of the preceding vectors. We emphasize that this vector cannot be $w_1$ or $w_2$ since
$\{w_1, \ldots, w_m\}$ is independent; hence it must be one of the v's, say $v_k$. Thus by the preceding problem
we can delete $v_k$ from the generating set (3) and obtain the generating set

$$\{w_1, w_2, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_{k-1}, v_{k+1}, \ldots, v_n\}$$

We repeat the argument with $w_3$ and so forth. At each step we are able to add one of the
w's and delete one of the v's in the generating set. If $m \le n$, then we finally obtain a generating
set of the required form:

$$\{w_1, \ldots, w_m, v_{i_1}, \ldots, v_{i_{n-m}}\}$$

Lastly, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain
the generating set $\{w_1, \ldots, w_n\}$. This implies that $w_{n+1}$ is a linear combination of $w_1, \ldots, w_n$ which
contradicts the hypothesis that $\{w_i\}$ is linearly independent.


5.15. Prove Theorem 5.3: Let V be a finite dimensional vector space. Then every basis of
V has the same number of vectors.


Suppose $\{e_1, e_2, \ldots, e_n\}$ is a basis of V, and suppose $\{f_1, f_2, \ldots\}$ is another basis of V. Since
$\{e_i\}$ generates V, the basis $\{f_1, f_2, \ldots\}$ must contain n or fewer vectors, or else it is dependent by the
preceding problem. On the other hand, if the basis $\{f_1, f_2, \ldots\}$ contains fewer than n vectors, then
$\{e_1, \ldots, e_n\}$ is dependent by the preceding problem. Thus the basis $\{f_1, f_2, \ldots\}$ contains exactly n
vectors, and so the theorem is true.


5.16. Prove Theorem 5.5: Suppose $\{v_1, \ldots, v_m\}$ is a maximal independent subset of a set
S which generates a vector space V. Then $\{v_1, \ldots, v_m\}$ is a basis of V.


Suppose $w \in S$. Then, since $\{v_i\}$ is a maximal independent subset of S, $\{v_1, \ldots, v_m, w\}$ is
linearly dependent. By Problem 5.10, w is a linear combination of the $v_i$, that is, $w \in L(v_i)$. Hence
$S \subset L(v_i)$. This leads to $V = L(S) \subset L(v_i) \subset V$. Accordingly, $\{v_i\}$ generates V and, since it is
independent, it is a basis of V.


5.17. Suppose V is generated by a finite set S. Show that V is of finite dimension and, in particular, a subset of S is a basis of V.

Method 1. Of all the independent subsets of S, and there is a finite number of them since S is finite, one of them is maximal. By the preceding problem this subset of S is a basis of V.

Method 2. If S is independent, it is a basis of V. If S is dependent, one of the vectors is a linear combination of the preceding vectors. We may delete this vector and still retain a generating set. We continue this process until we obtain a subset which is independent and generates V, i.e. is a basis of V.


5.18. Prove Theorem 5.6: Let V be of finite dimension n. Then:
(i) Any set of n+1 or more vectors is linearly dependent.
(ii) Any linearly independent set is part of a basis.
(iii) A linearly independent set with n elements is a basis.

Suppose $\{e_1, \ldots, e_n\}$ is a basis of V.

(i) Since $\{e_1, \ldots, e_n\}$ generates V, any n+1 or more vectors are dependent by Lemma 5.4.

(ii) Suppose $\{v_1, \ldots, v_r\}$ is independent. By Lemma 5.4, V is generated by a set of the form

\[ S = \{v_1, \ldots, v_r, e_{i_1}, \ldots, e_{i_{n-r}}\} \]

By the preceding problem, a subset of S is a basis. But S contains n elements and every basis of V contains n elements. Thus S is a basis of V and contains $\{v_1, \ldots, v_r\}$ as a subset.

(iii) By (ii), an independent set T with n elements is part of a basis. But every basis of V contains n elements. Thus T is a basis.


5.19. Prove Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then $\dim W \le n$. In particular, if dim W = n, then W = V.

Since V is of dimension n, any n+1 or more vectors are linearly dependent. Furthermore, since a basis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly, $\dim W \le n$.

In particular, if $\{w_1, \ldots, w_n\}$ is a basis of W, then since it is an independent set with n elements it is also a basis of V. Thus W = V when dim W = n.


5.20. Prove Theorem 5.8: $\dim(U+W) = \dim U + \dim W - \dim(U \cap W)$.

Observe that $U \cap W$ is a subspace of both U and W. Suppose dim U = m, dim W = n and $\dim(U \cap W) = r$. Suppose $\{v_1, \ldots, v_r\}$ is a basis of $U \cap W$. By Theorem 5.6(ii), we can extend $\{v_i\}$ to a basis of U and to a basis of W; say,

\[ \{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}\} \quad\text{and}\quad \{v_1, \ldots, v_r, w_1, \ldots, w_{n-r}\} \]

are bases of U and W respectively. Let

\[ B = \{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}, w_1, \ldots, w_{n-r}\} \]

Note that B has exactly m + n − r elements. Thus the theorem is proved if we can show that B is a basis of U + W. Since $\{v_i, u_j\}$ generates U and $\{v_i, w_k\}$ generates W, the union $B = \{v_i, u_j, w_k\}$ generates U + W. Thus it suffices to show that B is independent.

Suppose

\[ a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} + c_1w_1 + \cdots + c_{n-r}w_{n-r} = 0 \tag{1} \]

where $a_i$, $b_j$, $c_k$ are scalars. Let

\[ v = a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} \tag{2} \]

By (1), we also have that

\[ v = -c_1w_1 - \cdots - c_{n-r}w_{n-r} \tag{3} \]

Since $\{v_i, u_j\} \subset U$, $v \in U$ by (2); and since $\{w_k\} \subset W$, $v \in W$ by (3). Accordingly, $v \in U \cap W$. Now $\{v_i\}$ is a basis of $U \cap W$ and so there exist scalars $d_1, \ldots, d_r$ for which $v = d_1v_1 + \cdots + d_rv_r$. Thus by (3) we have

\[ d_1v_1 + \cdots + d_rv_r + c_1w_1 + \cdots + c_{n-r}w_{n-r} = 0 \]

But $\{v_i, w_k\}$ is a basis of W and so is independent. Hence the above equation forces $c_1 = 0, \ldots, c_{n-r} = 0$. Substituting this into (1), we obtain

\[ a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} = 0 \]

But $\{v_i, u_j\}$ is a basis of U and so is independent. Hence the above equation forces $a_1 = 0, \ldots, a_r = 0$, $b_1 = 0, \ldots, b_{m-r} = 0$.

Since the equation (1) implies that the $a_i$, $b_j$ and $c_k$ are all 0, $B = \{v_i, u_j, w_k\}$ is independent and the theorem is proved.


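5.21. Prove Theorem 5.9: The row rank and the column rank of any matrix are equal.

Let A be an arbitrary m × n matrix:

\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \]

Let $R_1, R_2, \ldots, R_m$ denote its rows:

\[ R_1 = (a_{11}, a_{12}, \ldots, a_{1n}), \ \ldots, \ R_m = (a_{m1}, a_{m2}, \ldots, a_{mn}) \]

Suppose the row rank is r and that the following r vectors form a basis for the row space:

\[ S_1 = (b_{11}, b_{12}, \ldots, b_{1n}), \ S_2 = (b_{21}, b_{22}, \ldots, b_{2n}), \ \ldots, \ S_r = (b_{r1}, b_{r2}, \ldots, b_{rn}) \]

Then each of the row vectors is a linear combination of the $S_i$:

\[ \begin{aligned} R_1 &= k_{11}S_1 + k_{12}S_2 + \cdots + k_{1r}S_r \\ R_2 &= k_{21}S_1 + k_{22}S_2 + \cdots + k_{2r}S_r \\ &\cdots \\ R_m &= k_{m1}S_1 + k_{m2}S_2 + \cdots + k_{mr}S_r \end{aligned} \]

where the $k_{ij}$ are scalars. Setting the ith components of each of the above vector equations equal to each other, we obtain the following system of equations, each valid for i = 1, ..., n:

\[ \begin{aligned} a_{1i} &= k_{11}b_{1i} + k_{12}b_{2i} + \cdots + k_{1r}b_{ri} \\ a_{2i} &= k_{21}b_{1i} + k_{22}b_{2i} + \cdots + k_{2r}b_{ri} \\ &\cdots \\ a_{mi} &= k_{m1}b_{1i} + k_{m2}b_{2i} + \cdots + k_{mr}b_{ri} \end{aligned} \]

Thus for i = 1, ..., n:

\[ \begin{pmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{pmatrix} = b_{1i}\begin{pmatrix} k_{11} \\ k_{21} \\ \vdots \\ k_{m1} \end{pmatrix} + b_{2i}\begin{pmatrix} k_{12} \\ k_{22} \\ \vdots \\ k_{m2} \end{pmatrix} + \cdots + b_{ri}\begin{pmatrix} k_{1r} \\ k_{2r} \\ \vdots \\ k_{mr} \end{pmatrix} \]

In other words, each of the columns of A is a linear combination of the r vectors

\[ \begin{pmatrix} k_{11} \\ k_{21} \\ \vdots \\ k_{m1} \end{pmatrix}, \ \begin{pmatrix} k_{12} \\ k_{22} \\ \vdots \\ k_{m2} \end{pmatrix}, \ \ldots, \ \begin{pmatrix} k_{1r} \\ k_{2r} \\ \vdots \\ k_{mr} \end{pmatrix} \]

Thus the column space of the matrix A has dimension at most r, i.e. column rank ≤ r. Hence, column rank ≤ row rank.

Similarly (or considering the transpose matrix $A^t$) we obtain row rank ≤ column rank. Thus the row rank and column rank are equal.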
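As a numerical illustration of Theorem 5.9 (not part of the proof), the following Python sketch computes the rank of a matrix and of its transpose by row reduction with exact fractions. The helper name `rank` and the sample matrix are assumptions chosen only for this illustration.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after reducing the matrix to echelon form."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # look for a nonzero entry in this column, at or below pivot_row
        p = next((i for i in range(pivot_row, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[pivot_row], m[p] = m[p], m[pivot_row]
        for i in range(pivot_row + 1, len(m)):
            f = m[i][col] / m[pivot_row][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[pivot_row])]
        pivot_row += 1
    return pivot_row

A = [(1, -2, 5, -3), (2, 3, 1, -4), (3, 8, -3, -5)]   # illustrative 3 x 4 matrix
A_t = list(zip(*A))                                    # rows of A_t are the columns of A

row_rank = rank(A)        # dimension of the row space of A
column_rank = rank(A_t)   # dimension of the column space of A
print(row_rank, column_rank)   # prints "2 2"
```

Exact fractions are used instead of floating point so that a zero row is recognized as exactly zero.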


BASIS AND DIMENSION 
5.22. Determine whether or not the following form a basis for the vector space $\mathbf{R}^3$:
(i) $(1, 1, 1)$ and $(1, -1, 5)$
(ii) $(1, 2, 3)$, $(1, 0, -1)$, $(3, -1, 0)$ and $(2, 1, -2)$
(iii) $(1, 1, 1)$, $(1, 2, 3)$ and $(2, -1, 1)$
(iv) $(1, 1, 2)$, $(1, 2, 5)$ and $(5, 3, 4)$

(i) and (ii). No; for a basis of $\mathbf{R}^3$ must contain exactly 3 elements, since $\mathbf{R}^3$ is of dimension 3.

(iii) The vectors form a basis if and only if they are independent. Thus form the matrix whose rows are the given vectors, and row reduce to echelon form:

\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & -1 & 1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & -3 & -1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 5 \end{pmatrix} \]

The echelon matrix has no zero rows; hence the three vectors are independent and so form a basis for $\mathbf{R}^3$.

(iv) Form the matrix whose rows are the given vectors, and row reduce to echelon form:

\[ \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 5 & 3 & 4 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & -2 & -6 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix} \]

The echelon matrix has a zero row, i.e. only two nonzero rows; hence the three vectors are dependent and so do not form a basis for $\mathbf{R}^3$.
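The test used in this problem (exactly three vectors, and no zero row after row reduction) can be sketched in Python as follows. The function names `rank` and `is_basis_of_R3` are hypothetical helpers introduced for this illustration; exact fractions are assumed so that no rounding occurs.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after reducing the matrix to echelon form."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        p = next((i for i in range(pivot_row, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[pivot_row], m[p] = m[p], m[pivot_row]
        for i in range(pivot_row + 1, len(m)):
            f = m[i][col] / m[pivot_row][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[pivot_row])]
        pivot_row += 1
    return pivot_row

def is_basis_of_R3(vectors):
    """A basis of R^3 needs exactly 3 vectors, and they must be independent."""
    return len(vectors) == 3 and rank(vectors) == 3

case_i = is_basis_of_R3([(1, 1, 1), (1, -1, 5)])               # only two vectors
case_iii = is_basis_of_R3([(1, 1, 1), (1, 2, 3), (2, -1, 1)])  # independent
case_iv = is_basis_of_R3([(1, 1, 2), (1, 2, 5), (5, 3, 4)])    # dependent
print(case_i, case_iii, case_iv)   # prints "False True False"
```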


5.23. Let W be the subspace of $\mathbf{R}^4$ generated by the vectors $(1, -2, 5, -3)$, $(2, 3, 1, -4)$ and $(3, 8, -3, -5)$. (i) Find a basis and the dimension of W. (ii) Extend the basis of W to a basis of the whole space $\mathbf{R}^4$.

(i) Form the matrix whose rows are the given vectors, and row reduce to echelon form:

\[ \begin{pmatrix} 1 & -2 & 5 & -3 \\ 2 & 3 & 1 & -4 \\ 3 & 8 & -3 & -5 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 14 & -18 & 4 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

The nonzero rows $(1, -2, 5, -3)$ and $(0, 7, -9, 2)$ of the echelon matrix form a basis of the row space, that is, of W. Thus, in particular, dim W = 2.

(ii) We seek four independent vectors which include the above two vectors. The vectors $(1, -2, 5, -3)$, $(0, 7, -9, 2)$, $(0, 0, 1, 0)$ and $(0, 0, 0, 1)$ are independent (since they form an echelon matrix), and so they form a basis of $\mathbf{R}^4$ which is an extension of the basis of W.


5.24. Let W be the space generated by the polynomials

\[ \begin{aligned} v_1 &= t^3 - 2t^2 + 4t + 1 & \qquad v_3 &= t^3 + 6t - 5 \\ v_2 &= 2t^3 - 3t^2 + 9t - 1 & \qquad v_4 &= 2t^3 - 5t^2 + 7t + 5 \end{aligned} \]

Find a basis and the dimension of W.

The coordinate vectors of the given polynomials relative to the basis $\{t^3, t^2, t, 1\}$ are respectively

\[ \begin{aligned} [v_1] &= (1, -2, 4, 1) & \qquad [v_3] &= (1, 0, 6, -5) \\ [v_2] &= (2, -3, 9, -1) & \qquad [v_4] &= (2, -5, 7, 5) \end{aligned} \]

Form the matrix whose rows are the above coordinate vectors, and row reduce to echelon form:

\[ \begin{pmatrix} 1 & -2 & 4 & 1 \\ 2 & -3 & 9 & -1 \\ 1 & 0 & 6 & -5 \\ 2 & -5 & 7 & 5 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 4 & 1 \\ 0 & 1 & 1 & -3 \\ 0 & 2 & 2 & -6 \\ 0 & -1 & -1 & 3 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 4 & 1 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

The nonzero rows $(1, -2, 4, 1)$ and $(0, 1, 1, -3)$ of the echelon matrix form a basis of the space generated by the coordinate vectors, and so the corresponding polynomials

\[ t^3 - 2t^2 + 4t + 1 \quad\text{and}\quad t^2 + t - 3 \]

form a basis of W. Thus dim W = 2.


5.25. Find the dimension and a basis of the solution space W of the system

\[ \begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ x + 2y + 3z + s + t &= 0 \\ 3x + 6y + 8z + s + 5t &= 0 \end{aligned} \]

Reduce the system to echelon form:

\[ \begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ z + 2s - 2t &= 0 \\ 2z + 4s - 4t &= 0 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ z + 2s - 2t &= 0 \end{aligned} \]

The system in echelon form has 2 (nonzero) equations in 5 unknowns; hence the dimension of the solution space W is 5 − 2 = 3. The free variables are y, s and t. Set

(i) y = 1, s = 0, t = 0, (ii) y = 0, s = 1, t = 0, (iii) y = 0, s = 0, t = 1

to obtain the respective solutions

\[ v_1 = (-2, 1, 0, 0, 0), \quad v_2 = (5, 0, -2, 1, 0), \quad v_3 = (-7, 0, 2, 0, 1) \]

The set $\{v_1, v_2, v_3\}$ is a basis of the solution space W.
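The procedure above (reduce the system, then set one free variable to 1 and the others to 0) can be sketched in Python. The function name `null_space_basis` is a hypothetical helper introduced here; it works on the coefficient matrix, with unknowns ordered x, y, z, s, t, and uses exact fractions.

```python
from fractions import Fraction

def null_space_basis(rows):
    """Basis of the solution space of the homogeneous system with these coefficient rows."""
    m = [[Fraction(x) for x in r] for r in rows]
    n = len(m[0])
    pivots = []                       # pivot column of each nonzero row
    r = 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                  # no pivot here: column c is a free variable
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]             # scale the pivot entry to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:           # clear the rest of column c
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fcol in free:
        v = [Fraction(0)] * n
        v[fcol] = Fraction(1)                     # this free variable = 1, the others = 0
        for row_idx, pc in enumerate(pivots):
            v[pc] = -m[row_idx][fcol]             # solve for the pivot variables
        basis.append(v)
    return basis

system = [(1, 2, 2, -1, 3), (1, 2, 3, 1, 1), (3, 6, 8, 1, 5)]
basis = null_space_basis(system)
for v in basis:
    print([int(x) for x in v])
# prints [-2, 1, 0, 0, 0], then [5, 0, -2, 1, 0], then [-7, 0, 2, 0, 1]
```

The number of vectors returned equals the number of free variables, which is the dimension of the solution space.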


5.26. Find a homogeneous system whose solution set W is generated by

\[ \{(1, -2, 0, 3), \ (1, -1, -1, 4), \ (1, 0, -2, 5)\} \]

Method 1. Let v = (x, y, z, w). Form the matrix M whose first rows are the given vectors and whose last row is v; and then row reduce to echelon form:

\[ M = \begin{pmatrix} 1 & -2 & 0 & 3 \\ 1 & -1 & -1 & 4 \\ 1 & 0 & -2 & 5 \\ x & y & z & w \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 0 & 3 \\ 0 & 1 & -1 & 1 \\ 0 & 2 & -2 & 2 \\ 0 & 2x+y & z & -3x+w \end{pmatrix} \text{ to } \begin{pmatrix} 1 & -2 & 0 & 3 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 2x+y+z & -5x-y+w \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

The original first three rows show that W has dimension 2. Thus $v \in W$ if and only if the additional row does not increase the dimension of the row space. Hence we set the last two entries in the third row on the right equal to 0 to obtain the required homogeneous system

\[ \begin{aligned} 2x + y + z &= 0 \\ 5x + y - w &= 0 \end{aligned} \]

Method 2. We know that $v = (x, y, z, w) \in W$ if and only if v is a linear combination of the generators of W:

\[ (x, y, z, w) = r(1, -2, 0, 3) + s(1, -1, -1, 4) + t(1, 0, -2, 5) \]

The above vector equation in unknowns r, s and t is equivalent to the following system:

\[ \begin{aligned} r + s + t &= x \\ -2r - s &= y \\ -s - 2t &= z \\ 3r + 4s + 5t &= w \end{aligned} \quad\text{or}\quad \begin{aligned} r + s + t &= x \\ s + 2t &= 2x + y \\ -s - 2t &= z \\ s + 2t &= w - 3x \end{aligned} \quad\text{or}\quad \begin{aligned} r + s + t &= x \\ s + 2t &= 2x + y \\ 0 &= 2x + y + z \\ 0 &= 5x + y - w \end{aligned} \tag{1} \]

Thus $v \in W$ if and only if the above system has a solution, i.e. if

\[ \begin{aligned} 2x + y + z &= 0 \\ 5x + y - w &= 0 \end{aligned} \]

The above is the required homogeneous system.

Remark: Observe that the augmented matrix of the system (1) is the transpose of the matrix M used in the first method.
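A quick check of the answer, sketched in Python: each generator of W should satisfy both equations $2x + y + z = 0$ and $5x + y - w = 0$. The variable names below are chosen only for this illustration.

```python
generators = [(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)]

# coefficient rows of the system, with unknowns ordered (x, y, z, w)
system = [(2, 1, 1, 0),     # 2x + y + z     = 0
          (5, 1, 0, -1)]    # 5x + y     - w = 0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# residuals[i][j] is the left side of equation j evaluated at generator i
residuals = [[dot(eq, g) for eq in system] for g in generators]
print(residuals)   # prints [[0, 0], [0, 0], [0, 0]]
```

Since the two equations are independent, the solution space has dimension 4 − 2 = 2 = dim W; together with the check above this confirms that the solution set is exactly W.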


5.27. Let U and W be the following subspaces of $\mathbf{R}^4$:

\[ U = \{(a, b, c, d) : b + c + d = 0\}, \qquad W = \{(a, b, c, d) : a + b = 0, \ c = 2d\} \]

Find the dimension and a basis of (i) U, (ii) W, (iii) $U \cap W$.

(i) We seek a basis of the set of solutions (a, b, c, d) of the equation

\[ b + c + d = 0 \quad\text{or}\quad 0 \cdot a + b + c + d = 0 \]

The free variables are a, c and d. Set

(1) a = 1, c = 0, d = 0, (2) a = 0, c = 1, d = 0, (3) a = 0, c = 0, d = 1

to obtain the respective solutions

\[ v_1 = (1, 0, 0, 0), \quad v_2 = (0, -1, 1, 0), \quad v_3 = (0, -1, 0, 1) \]

The set $\{v_1, v_2, v_3\}$ is a basis of U, and dim U = 3.

(ii) We seek a basis of the set of solutions (a, b, c, d) of the system

\[ \begin{aligned} a + b &= 0 \\ c &= 2d \end{aligned} \quad\text{or}\quad \begin{aligned} a + b &= 0 \\ c - 2d &= 0 \end{aligned} \]

The free variables are b and d. Set

(1) b = 1, d = 0, (2) b = 0, d = 1

to obtain the respective solutions

\[ v_1 = (-1, 1, 0, 0), \quad v_2 = (0, 0, 2, 1) \]

The set $\{v_1, v_2\}$ is a basis of W, and dim W = 2.

(iii) $U \cap W$ consists of those vectors (a, b, c, d) which satisfy the conditions defining U and the conditions defining W, i.e. the three equations

\[ \begin{aligned} b + c + d &= 0 \\ a + b &= 0 \\ c &= 2d \end{aligned} \quad\text{or}\quad \begin{aligned} a + b &= 0 \\ b + c + d &= 0 \\ c - 2d &= 0 \end{aligned} \]

The free variable is d. Set d = 1 to obtain the solution $v = (3, -3, 2, 1)$. Thus $\{v\}$ is a basis of $U \cap W$, and $\dim(U \cap W) = 1$.


5.28. Find the dimension of the vector space spanned by:
(i) $(1, -2, 3, 1)$ and $(1, 1, -2, 3)$
(ii) $(3, -6, 3, -9)$ and $(-2, 4, -2, 6)$
(iii) $t^3 + 2t^2 + 3t + 1$ and $2t^3 + 4t^2 + 6t + 2$
(iv) $t^3 - 2t^2 + 5$ and $t^2 + 3t - 4$
(v) and (vi) two pairs of matrices (entries illegible in the source)
(vii) 3 and −3

Two nonzero vectors span a space W of dimension 2 if they are independent, and of dimension 1 if they are dependent. Recall that two vectors are dependent if and only if one is a multiple of the other. Hence: (i) 2, (ii) 1, (iii) 1, (iv) 2, (v) 2, (vi) 1, (vii) 1.


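5.29. Let V be the vector space of 2 by 2 symmetric matrices over K. Show that dim V = 3. (Recall that $A = (a_{ij})$ is symmetric iff $A = A^t$ or, equivalently, $a_{ij} = a_{ji}$.)

An arbitrary 2 by 2 symmetric matrix is of the form $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ where $a, b, c \in K$. (Note that there are three "variables".) Setting

(i) a = 1, b = 0, c = 0, (ii) a = 0, b = 1, c = 0, (iii) a = 0, b = 0, c = 1

we obtain the respective matrices

\[ E_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad E_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad E_3 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \]

We show that $\{E_1, E_2, E_3\}$ is a basis of V, that is, that it (1) generates V and (2) is independent.

(1) For the above arbitrary matrix A in V, we have

\[ A = \begin{pmatrix} a & b \\ b & c \end{pmatrix} = aE_1 + bE_2 + cE_3 \]

Thus $\{E_1, E_2, E_3\}$ generates V.

(2) Suppose $xE_1 + yE_2 + zE_3 = 0$, where x, y, z are unknown scalars. That is, suppose

\[ x\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + y\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + z\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} x & y \\ y & z \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]

Setting corresponding entries equal to each other, we obtain x = 0, y = 0, z = 0. In other words,

\[ xE_1 + yE_2 + zE_3 = 0 \quad\text{implies}\quad x = 0, \ y = 0, \ z = 0 \]

Accordingly, $\{E_1, E_2, E_3\}$ is independent.

Thus $\{E_1, E_2, E_3\}$ is a basis of V and so the dimension of V is 3.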


5.30. Let V be the space of polynomials in t of degree ≤ n. Show that each of the following is a basis of V:

(i) $\{1, t, t^2, \ldots, t^{n-1}, t^n\}$, (ii) $\{1, 1-t, (1-t)^2, \ldots, (1-t)^{n-1}, (1-t)^n\}$.

Thus dim V = n + 1.

(i) Clearly each polynomial in V is a linear combination of $1, t, \ldots, t^{n-1}$ and $t^n$. Furthermore, $1, t, \ldots, t^{n-1}$ and $t^n$ are independent since none is a linear combination of the preceding polynomials. Thus $\{1, t, \ldots, t^n\}$ is a basis of V.

(ii) (Note that by (i), dim V = n + 1; and so any n + 1 independent polynomials form a basis of V.) Now each polynomial in the sequence $1, 1-t, \ldots, (1-t)^n$ is of degree higher than the preceding ones and so is not a linear combination of the preceding ones. Thus the n + 1 polynomials $1, 1-t, \ldots, (1-t)^n$ are independent and so form a basis of V.


5.31. Let V be the vector space of ordered pairs of complex numbers over the real field R (see Problem 4.42). Show that V is of dimension 4.

We claim that the following is a basis of V:

\[ B = \{(1, 0), \ (i, 0), \ (0, 1), \ (0, i)\} \]

Suppose $v \in V$. Then v = (z, w) where z, w are complex numbers, and so v = (a + bi, c + di) where a, b, c, d are real numbers. Then

\[ v = a(1, 0) + b(i, 0) + c(0, 1) + d(0, i) \]

Thus B generates V.

The proof is complete if we show that B is independent. Suppose

\[ x_1(1, 0) + x_2(i, 0) + x_3(0, 1) + x_4(0, i) = 0 \]

where $x_1, x_2, x_3, x_4 \in \mathbf{R}$. Then

\[ (x_1 + x_2 i, \ x_3 + x_4 i) = (0, 0) \quad\text{and so}\quad \begin{cases} x_1 + x_2 i = 0 \\ x_3 + x_4 i = 0 \end{cases} \]

Accordingly $x_1 = 0$, $x_2 = 0$, $x_3 = 0$, $x_4 = 0$ and so B is independent.


5.32. Let V be the vector space of m × n matrices over a field K. Let $E_{ij} \in V$ be the matrix with 1 as the ij-entry and 0 elsewhere. Show that $\{E_{ij}\}$ is a basis of V. Thus dim V = mn.

We need to show that $\{E_{ij}\}$ generates V and is independent.

Let $A = (a_{ij})$ be any matrix in V. Then $A = \sum_{i,j} a_{ij}E_{ij}$. Hence $\{E_{ij}\}$ generates V.

Now suppose that $\sum_{i,j} x_{ij}E_{ij} = 0$ where the $x_{ij}$ are scalars. The ij-entry of $\sum_{i,j} x_{ij}E_{ij}$ is $x_{ij}$, and the ij-entry of 0 is 0. Thus $x_{ij} = 0$, i = 1, ..., m, j = 1, ..., n. Accordingly the matrices $E_{ij}$ are independent.

Thus $\{E_{ij}\}$ is a basis of V.

Remark: Viewing a vector in $K^n$ as a 1 × n matrix, we have shown by the above result that the usual basis defined in Example 5.8, page 88, is a basis of $K^n$ and that $\dim K^n = n$.


SUMS AND INTERSECTIONS 


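5.33. Suppose U and W are distinct 4-dimensional subspaces of a vector space V of dimension 6. Find the possible dimensions of $U \cap W$.

Since U and W are distinct, U + W properly contains U and W; hence dim(U + W) > 4. But dim(U + W) cannot be greater than 6, since dim V = 6. Hence we have two possibilities: (i) dim(U + W) = 5, or (ii) dim(U + W) = 6. Using Theorem 5.8 that $\dim(U+W) = \dim U + \dim W - \dim(U \cap W)$, we obtain

(i) $5 = 4 + 4 - \dim(U \cap W)$ or $\dim(U \cap W) = 3$
(ii) $6 = 4 + 4 - \dim(U \cap W)$ or $\dim(U \cap W) = 2$

That is, the dimension of $U \cap W$ must be either 2 or 3.

5.34. Let U and W be the subspaces of $\mathbf{R}^4$ generated by

\[ \{(1, 1, 0, -1), \ (1, 2, 3, 0), \ (2, 3, 3, -1)\} \quad\text{and}\quad \{(1, 2, 2, -2), \ (2, 3, 2, -3), \ (1, 3, 4, -3)\} \]

respectively. Find (i) dim(U + W), (ii) $\dim(U \cap W)$.

(i) U + W is the space spanned by all six vectors. Hence form the matrix whose rows are the given six vectors, and then row reduce to echelon form:

\[ \begin{pmatrix} 1 & 1 & 0 & -1 \\ 1 & 2 & 3 & 0 \\ 2 & 3 & 3 & -1 \\ 1 & 2 & 2 & -2 \\ 2 & 3 & 2 & -3 \\ 1 & 3 & 4 & -3 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 0 & -1 \\ 0 & 1 & 3 & 1 \\ 0 & 1 & 3 & 1 \\ 0 & 1 & 2 & -1 \\ 0 & 1 & 2 & -1 \\ 0 & 2 & 4 & -2 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 0 & -1 \\ 0 & 1 & 3 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & -2 \\ 0 & 0 & -1 & -2 \\ 0 & 0 & -2 & -4 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 0 & -1 \\ 0 & 1 & 3 & 1 \\ 0 & 0 & -1 & -2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

Since the echelon matrix has three nonzero rows, dim(U + W) = 3.

(ii) First find dim U and dim W. Form the two matrices whose rows are the generators of U and W respectively and then row reduce each to echelon form:

\[ \begin{pmatrix} 1 & 1 & 0 & -1 \\ 1 & 2 & 3 & 0 \\ 2 & 3 & 3 & -1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 0 & -1 \\ 0 & 1 & 3 & 1 \\ 0 & 1 & 3 & 1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 1 & 0 & -1 \\ 0 & 1 & 3 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

and

\[ \begin{pmatrix} 1 & 2 & 2 & -2 \\ 2 & 3 & 2 & -3 \\ 1 & 3 & 4 & -3 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 2 & 2 & -2 \\ 0 & -1 & -2 & 1 \\ 0 & 1 & 2 & -1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 2 & 2 & -2 \\ 0 & -1 & -2 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

Since each of the echelon matrices has two nonzero rows, dim U = 2 and dim W = 2. Using Theorem 5.8 that $\dim(U+W) = \dim U + \dim W - \dim(U \cap W)$, we have

\[ 3 = 2 + 2 - \dim(U \cap W) \quad\text{or}\quad \dim(U \cap W) = 1 \]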
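The computation in Problem 5.34 can be sketched in Python: find the three ranks by row reduction, then apply Theorem 5.8 to get the dimension of the intersection. The helper name `rank` is an assumption introduced for this illustration; exact fractions are used.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after reducing the matrix to echelon form."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        p = next((i for i in range(pivot_row, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[pivot_row], m[p] = m[p], m[pivot_row]
        for i in range(pivot_row + 1, len(m)):
            f = m[i][col] / m[pivot_row][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[pivot_row])]
        pivot_row += 1
    return pivot_row

U = [(1, 1, 0, -1), (1, 2, 3, 0), (2, 3, 3, -1)]     # generators of U
W = [(1, 2, 2, -2), (2, 3, 2, -3), (1, 3, 4, -3)]    # generators of W

dim_U = rank(U)
dim_W = rank(W)
dim_sum = rank(U + W)                        # U + W is spanned by all six vectors
dim_intersection = dim_U + dim_W - dim_sum   # Theorem 5.8, rearranged
print(dim_U, dim_W, dim_sum, dim_intersection)   # prints "2 2 3 1"
```

Here `U + W` in the code is list concatenation, which stacks the six generators into one matrix.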


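5.35. Let U be the subspace of $\mathbf{R}^5$ generated by

\[ \{(1, 3, -2, 2, 3), \ (1, 4, -3, 4, 2), \ (2, 3, -1, -2, 9)\} \]

and let W be the subspace generated by

\[ \{(1, 3, 0, 2, 1), \ (1, 5, -6, 6, 3), \ (2, 5, 3, 2, 1)\} \]

Find a basis and the dimension of (i) U + W, (ii) $U \cap W$.

(i) U + W is the space generated by all six vectors. Hence form the matrix whose rows are the six vectors and then row reduce to echelon form:

\[ \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 1 & 4 & -3 & 4 & 2 \\ 2 & 3 & -1 & -2 & 9 \\ 1 & 3 & 0 & 2 & 1 \\ 1 & 5 & -6 & 6 & 3 \\ 2 & 5 & 3 & 2 & 1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & -3 & 3 & -6 & 3 \\ 0 & 0 & 2 & 0 & -2 \\ 0 & 2 & -4 & 4 & 0 \\ 0 & -1 & 7 & -2 & -5 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & -2 \\ 0 & 0 & -2 & 0 & 2 \\ 0 & 0 & 6 & 0 & -6 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & 2 & 0 & -2 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

The set of nonzero rows of the echelon matrix,

\[ \{(1, 3, -2, 2, 3), \ (0, 1, -1, 2, -1), \ (0, 0, 2, 0, -2)\} \]

is a basis of U + W; thus dim(U + W) = 3.

(ii) First find homogeneous systems whose solution sets are U and W respectively. Form the matrix whose first rows are the generators of U and whose last row is (x, y, z, s, t) and then row reduce to echelon form:

\[ \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 1 & 4 & -3 & 4 & 2 \\ 2 & 3 & -1 & -2 & 9 \\ x & y & z & s & t \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & -3 & 3 & -6 & 3 \\ 0 & -3x+y & 2x+z & -2x+s & -3x+t \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & -x+y+z & 4x-2y+s & -6x+y+t \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

Set the entries of the third row equal to 0 to obtain the homogeneous system whose solution set is U:

\[ -x + y + z = 0, \quad 4x - 2y + s = 0, \quad -6x + y + t = 0 \]

Now form the matrix whose first rows are the generators of W and whose last row is (x, y, z, s, t) and then row reduce to echelon form:

\[ \begin{pmatrix} 1 & 3 & 0 & 2 & 1 \\ 1 & 5 & -6 & 6 & 3 \\ 2 & 5 & 3 & 2 & 1 \\ x & y & z & s & t \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & 0 & 2 & 1 \\ 0 & 2 & -6 & 4 & 2 \\ 0 & -1 & 3 & -2 & -1 \\ 0 & -3x+y & z & -2x+s & -x+t \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & 0 & 2 & 1 \\ 0 & 1 & -3 & 2 & 1 \\ 0 & 0 & -9x+3y+z & 4x-2y+s & 2x-y+t \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

Set the entries of the third row equal to 0 to obtain the homogeneous system whose solution set is W:

\[ -9x + 3y + z = 0, \quad 4x - 2y + s = 0, \quad 2x - y + t = 0 \]

Combining both systems, we obtain the homogeneous system whose solution set is $U \cap W$:

\[ \begin{aligned} -x + y + z &= 0 \\ 4x - 2y + s &= 0 \\ -6x + y + t &= 0 \\ -9x + 3y + z &= 0 \\ 4x - 2y + s &= 0 \\ 2x - y + t &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} -x + y + z &= 0 \\ 2y + 4z + s &= 0 \\ -5y - 6z + t &= 0 \\ -6y - 8z &= 0 \\ 2y + 4z + s &= 0 \\ y + 2z + t &= 0 \end{aligned} \]

\[ \text{or}\quad \begin{aligned} -x + y + z &= 0 \\ 2y + 4z + s &= 0 \\ 8z + 5s + 2t &= 0 \\ 4z + 3s &= 0 \\ s - 2t &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} -x + y + z &= 0 \\ 2y + 4z + s &= 0 \\ 8z + 5s + 2t &= 0 \\ s - 2t &= 0 \end{aligned} \]

There is one free variable, which is t; hence $\dim(U \cap W) = 1$. Setting t = 2, we obtain the solution x = 1, y = 4, z = −3, s = 4, t = 2. Thus $\{(1, 4, -3, 4, 2)\}$ is a basis of $U \cap W$.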


COORDINATE VECTORS 
5.36. Find the coordinate vector of v relative to the basis $\{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$ of $\mathbf{R}^3$ where (i) $v = (4, -3, 2)$, (ii) $v = (a, b, c)$.

In each case set v as a linear combination of the basis vectors using unknown scalars x, y and z:

\[ v = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) \]

and then solve for the solution vector (x, y, z). (The solution is unique since the basis vectors are linearly independent.)

(i)
\[ \begin{aligned} (4, -3, 2) &= x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) \\ &= (x, x, x) + (y, y, 0) + (z, 0, 0) \\ &= (x + y + z, \ x + y, \ x) \end{aligned} \]

Set corresponding components equal to each other to obtain the system

\[ x + y + z = 4, \quad x + y = -3, \quad x = 2 \]

Substitute x = 2 into the second equation to obtain y = −5; then put x = 2, y = −5 into the first equation to obtain z = 7. Thus x = 2, y = −5, z = 7 is the unique solution to the system and so the coordinate vector of v relative to the given basis is $[v] = (2, -5, 7)$.

(ii)
\[ (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, \ x + y, \ x) \]

Then

\[ x + y + z = a, \quad x + y = b, \quad x = c \]

from which x = c, y = b − c, z = a − b. Thus $[v] = (c, b-c, a-b)$, that is, $[(a, b, c)] = (c, b-c, a-b)$.
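Finding a coordinate vector amounts to solving a linear system whose coefficient columns are the basis vectors. A minimal Python sketch follows; the function name `coordinates` is a hypothetical helper introduced here, and it assumes the given vectors really do form a basis (otherwise no pivot is found and the function raises `StopIteration`).

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve x_1*basis[0] + ... + x_n*basis[n-1] = v by Gauss-Jordan elimination."""
    n = len(basis)
    # augmented matrix: column j holds basis[j], the last column holds v
    aug = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
           for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)   # pivot search
        aug[c], aug[p] = aug[p], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]                    # scale the pivot to 1
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [aug[i][n] for i in range(n)]

basis = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
coords_i = coordinates(basis, (4, -3, 2))
coords_ii = coordinates(basis, (1, 2, 3))     # (a, b, c) = (1, 2, 3)
print([int(x) for x in coords_i])    # prints [2, -5, 7]
print([int(x) for x in coords_ii])   # prints [3, -1, -1], i.e. (c, b - c, a - b)
```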


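5.37. Let V be the vector space of 2 × 2 matrices over R. Find the coordinate vector of the matrix $A \in V$ relative to the basis

\[ \left\{ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \ \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}, \ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \right\} \quad\text{where}\quad A = \begin{pmatrix} 2 & 3 \\ 4 & -7 \end{pmatrix} \]

Set A as a linear combination of the matrices in the basis using unknown scalars x, y, z, w:

\[ \begin{aligned} A = \begin{pmatrix} 2 & 3 \\ 4 & -7 \end{pmatrix} &= x\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + y\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} + z\begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix} + w\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \\ &= \begin{pmatrix} x + z + w & x - y - z \\ x + y & x \end{pmatrix} \end{aligned} \]

Set corresponding entries equal to each other to obtain the system

\[ x + z + w = 2, \quad x - y - z = 3, \quad x + y = 4, \quad x = -7 \]

from which x = −7, y = 11, z = −21, w = 30. Thus $[A] = (-7, 11, -21, 30)$. (Note that the coordinate vector of A must be a vector in $\mathbf{R}^4$ since dim V = 4.)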


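5.38. Let W be the vector space of 2 × 2 symmetric matrices over R. (See Problem 5.29.) Find the coordinate vector of the matrix $A = \begin{pmatrix} 4 & -11 \\ -11 & -7 \end{pmatrix}$ relative to the basis

\[ \left\{ \begin{pmatrix} 1 & -2 \\ -2 & 1 \end{pmatrix}, \ \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}, \ \begin{pmatrix} 4 & -1 \\ -1 & -5 \end{pmatrix} \right\} \]

Set A as a linear combination of the matrices in the basis using unknown scalars x, y and z:

\[ A = \begin{pmatrix} 4 & -11 \\ -11 & -7 \end{pmatrix} = x\begin{pmatrix} 1 & -2 \\ -2 & 1 \end{pmatrix} + y\begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} + z\begin{pmatrix} 4 & -1 \\ -1 & -5 \end{pmatrix} = \begin{pmatrix} x + 2y + 4z & -2x + y - z \\ -2x + y - z & x + 3y - 5z \end{pmatrix} \]

Set corresponding entries equal to each other to obtain the equivalent system of linear equations and reduce to echelon form:

\[ \begin{aligned} x + 2y + 4z &= 4 \\ -2x + y - z &= -11 \\ x + 3y - 5z &= -7 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + 4z &= 4 \\ 5y + 7z &= -3 \\ y - 9z &= -11 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + 4z &= 4 \\ 5y + 7z &= -3 \\ 52z &= 52 \end{aligned} \]

We obtain z = 1 from the third equation, then y = −2 from the second equation, and then x = 4 from the first equation. Thus the solution of the system is x = 4, y = −2, z = 1; hence $[A] = (4, -2, 1)$. (Since dim W = 3 by Problem 5.29, the coordinate vector of A must be a vector in $\mathbf{R}^3$.)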


5.39. Let $\{e_1, e_2, e_3\}$ and $\{f_1, f_2, f_3\}$ be bases of a vector space V (of dimension 3). Suppose

\[ \begin{aligned} e_1 &= a_1f_1 + a_2f_2 + a_3f_3 \\ e_2 &= b_1f_1 + b_2f_2 + b_3f_3 \\ e_3 &= c_1f_1 + c_2f_2 + c_3f_3 \end{aligned} \tag{1} \]

Let P be the matrix whose rows are the coordinate vectors of $e_1$, $e_2$ and $e_3$ respectively, relative to the basis $\{f_i\}$:

\[ P = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \]

Show that, for any vector $v \in V$, $[v]_e P = [v]_f$. That is, multiplying the coordinate vector of v relative to the basis $\{e_i\}$ by the matrix P, we obtain the coordinate vector of v relative to the basis $\{f_i\}$. (The matrix P is frequently called the change of basis matrix.)

Suppose $v = re_1 + se_2 + te_3$; then $[v]_e = (r, s, t)$. Using (1), we have

\[ \begin{aligned} v &= r(a_1f_1 + a_2f_2 + a_3f_3) + s(b_1f_1 + b_2f_2 + b_3f_3) + t(c_1f_1 + c_2f_2 + c_3f_3) \\ &= (ra_1 + sb_1 + tc_1)f_1 + (ra_2 + sb_2 + tc_2)f_2 + (ra_3 + sb_3 + tc_3)f_3 \end{aligned} \]

Hence

\[ [v]_f = (ra_1 + sb_1 + tc_1, \ ra_2 + sb_2 + tc_2, \ ra_3 + sb_3 + tc_3) \]

On the other hand,

\[ [v]_e P = (r, s, t)\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} = (ra_1 + sb_1 + tc_1, \ ra_2 + sb_2 + tc_2, \ ra_3 + sb_3 + tc_3) \]

Accordingly, $[v]_e P = [v]_f$.

Remark: In Chapter 8 we shall write coordinate vectors as column vectors rather than row vectors. Then, by above,

\[ Q[v]_e = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}\begin{pmatrix} r \\ s \\ t \end{pmatrix} = \begin{pmatrix} ra_1 + sb_1 + tc_1 \\ ra_2 + sb_2 + tc_2 \\ ra_3 + sb_3 + tc_3 \end{pmatrix} = [v]_f \]

where Q is the matrix whose columns are the coordinate vectors of $e_1$, $e_2$ and $e_3$ respectively, relative to the basis $\{f_i\}$. Note that Q is the transpose of P and that Q appears on the left of the column vector $[v]_e$ whereas P appears on the right of the row vector $[v]_e$.
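A small numerical instance of the identity $[v]_e P = [v]_f$, sketched in Python. As an assumption for the illustration, $\{f_i\}$ is taken to be the usual basis of $\mathbf{R}^3$ and $\{e_i\}$ the basis $\{(1,1,1), (1,1,0), (1,0,0)\}$ of Problem 5.36, so the rows of P are just the vectors $e_i$ and $[v]_f = v$.

```python
# Rows of P: coordinates of e_1, e_2, e_3 relative to the usual basis f.
P = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]

v_e = [2, -5, 7]    # coordinates of v = (4, -3, 2) relative to e (Problem 5.36)

def row_times_matrix(row, M):
    """Row vector times matrix: entry j is sum_i row[i] * M[i][j]."""
    return [sum(row[i] * M[i][j] for i in range(len(row)))
            for j in range(len(M[0]))]

def matrix_times_column(M, col):
    """Matrix times column vector: entry i is sum_j M[i][j] * col[j]."""
    return [sum(M[i][j] * col[j] for j in range(len(col)))
            for i in range(len(M))]

v_f_row = row_times_matrix(v_e, P)        # row convention: [v]_e P

Q = [list(c) for c in zip(*P)]            # Q is the transpose of P
v_f_col = matrix_times_column(Q, v_e)     # column convention: Q [v]_e

print(v_f_row, v_f_col)   # prints "[4, -3, 2] [4, -3, 2]"
```

Both conventions give the same coordinates; only the side on which the matrix is written changes.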


RANK OF A MATRIX 
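5.40. Find the rank of the matrix A where:

\[ \text{(i) } A = \begin{pmatrix} 1 & 3 & 1 & -2 & -3 \\ 1 & 4 & 3 & -1 & -4 \\ 2 & 3 & -4 & -7 & -3 \\ 3 & 8 & 1 & -7 & -8 \end{pmatrix} \qquad \text{(ii) } A = \begin{pmatrix} 1 & 2 & -3 \\ 2 & 1 & 0 \\ -2 & -1 & 3 \\ -1 & 4 & -2 \end{pmatrix} \qquad \text{(iii) } A = \begin{pmatrix} 1 & 3 \\ 0 & -2 \\ 5 & -1 \\ -2 & 3 \end{pmatrix} \]

(i) Row reduce to echelon form:

\[ \begin{pmatrix} 1 & 3 & 1 & -2 & -3 \\ 1 & 4 & 3 & -1 & -4 \\ 2 & 3 & -4 & -7 & -3 \\ 3 & 8 & 1 & -7 & -8 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & -3 & -6 & -3 & 3 \\ 0 & -1 & -2 & -1 & 1 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

Since the echelon matrix has two nonzero rows, rank (A) = 2.

(ii) Since row rank equals column rank, it is easier to form the transpose of A and then row reduce to echelon form:

\[ \begin{pmatrix} 1 & 2 & -2 & -1 \\ 2 & 1 & -1 & 4 \\ -3 & 0 & 3 & -2 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 2 & -2 & -1 \\ 0 & -3 & 3 & 6 \\ 0 & 6 & -3 & -5 \end{pmatrix} \text{ to } \begin{pmatrix} 1 & 2 & -2 & -1 \\ 0 & -3 & 3 & 6 \\ 0 & 0 & 3 & 7 \end{pmatrix} \]

Thus rank (A) = 3.

(iii) The two columns are linearly independent since one is not a multiple of the other. Hence rank (A) = 2.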


5.41. Let A and B be arbitrary matrices for which the product AB is defined. Show that rank (AB) ≤ rank (B) and rank (AB) ≤ rank (A).

By Problem 4.33, page 80, the row space of AB is contained in the row space of B; hence rank (AB) ≤ rank (B). Furthermore, by Problem 4.71, page 84, the column space of AB is contained in the column space of A; hence rank (AB) ≤ rank (A).


5.42. Let A be an n-square matrix. Show that A is invertible if and only if rank (A) = n.

Note that the rows of the n-square identity matrix $I_n$ are linearly independent since $I_n$ is in echelon form; hence rank ($I_n$) = n. Now if A is invertible then, by Problem 3.36, page 57, A is row equivalent to $I_n$; hence rank (A) = n. But if A is not invertible then A is row equivalent to a matrix with a zero row; hence rank (A) < n. That is, A is invertible if and only if rank (A) = n.


5.43. Let $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ be the free variables of a homogeneous system of linear equations with n unknowns. Let $v_j$ be the solution for which $x_{i_j} = 1$, and all other free variables = 0. Show that the solutions $v_1, v_2, \ldots, v_k$ are linearly independent.

Let A be the matrix whose rows are the $v_i$ respectively. We interchange column 1 and column $i_1$, then column 2 and column $i_2$, ..., and then column k and column $i_k$; and obtain the k × n matrix

\[ B = (I, C) = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 & c_{1,k+1} & \cdots & c_{1n} \\ 0 & 1 & 0 & \cdots & 0 & 0 & c_{2,k+1} & \cdots & c_{2n} \\ \cdots & & & & & & & & \cdots \\ 0 & 0 & 0 & \cdots & 0 & 1 & c_{k,k+1} & \cdots & c_{kn} \end{pmatrix} \]

The above matrix B is in echelon form and so its rows are independent; hence rank (B) = k. Since A and B are column equivalent, they have the same rank, i.e. rank (A) = k. But A has k rows; hence these rows, i.e. the $v_i$, are linearly independent as claimed.


MISCELLANEOUS PROBLEMS 


5.44. The concept of linear dependence is extended to every set of vectors, finite or infinite, as follows: the set of vectors $A = \{v_i\}$ is linearly dependent iff there exist vectors $v_{i_1}, \ldots, v_{i_n} \in A$ and scalars $a_1, \ldots, a_n \in K$, not all of them 0, such that

\[ a_1v_{i_1} + a_2v_{i_2} + \cdots + a_nv_{i_n} = 0 \]

Otherwise A is said to be linearly independent. Suppose that $A_1, A_2, \ldots$ are linearly independent sets of vectors, and that $A_1 \subset A_2 \subset \cdots$. Show that the union $A = A_1 \cup A_2 \cup \cdots$ is also linearly independent.

Suppose A is linearly dependent. Then there exist vectors $v_1, \ldots, v_n \in A$ and scalars $a_1, \ldots, a_n \in K$, not all of them 0, such that

\[ a_1v_1 + a_2v_2 + \cdots + a_nv_n = 0 \tag{1} \]

Since $A = \cup A_i$ and the $v_i \in A$, there exist sets $A_{i_1}, \ldots, A_{i_n}$ such that

\[ v_1 \in A_{i_1}, \ v_2 \in A_{i_2}, \ \ldots, \ v_n \in A_{i_n} \]

Let k be the maximum index of the sets $A_{i_j}$: $k = \max(i_1, \ldots, i_n)$. It follows then, since $A_1 \subset A_2 \subset \cdots$, that each $A_{i_j}$ is contained in $A_k$. Hence $v_1, v_2, \ldots, v_n \in A_k$ and so, by (1), $A_k$ is linearly dependent, which contradicts our hypothesis. Thus A is linearly independent.


5.45. Consider a finite sequence of vectors $S = \{v_1, v_2, \ldots, v_n\}$. Let T be the sequence of vectors obtained from S by one of the following "elementary operations": (i) interchange two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that S and T generate the same space W. Also show that T is independent if and only if S is independent.

Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On the other hand, each operation has an inverse of the same type (Prove!); hence the vectors in S are linear combinations of vectors in T. Thus S and T generate the same space W. Also, T is independent if and only if dim W = n, and this is true iff S is also independent.


5.46. Let $A = (a_{ij})$ and $B = (b_{ij})$ be row equivalent m × n matrices over a field K, and let $v_1, \ldots, v_n$ be any vectors in a vector space V over K. Let

\[ \begin{aligned} u_1 &= a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n & \qquad w_1 &= b_{11}v_1 + b_{12}v_2 + \cdots + b_{1n}v_n \\ u_2 &= a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n & \qquad w_2 &= b_{21}v_1 + b_{22}v_2 + \cdots + b_{2n}v_n \\ &\cdots & &\cdots \\ u_m &= a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n & \qquad w_m &= b_{m1}v_1 + b_{m2}v_2 + \cdots + b_{mn}v_n \end{aligned} \]

Show that $\{u_i\}$ and $\{w_i\}$ generate the same space.

Applying an "elementary operation" of the preceding problem to $\{u_i\}$ is equivalent to applying an elementary row operation to the matrix A. Since A and B are row equivalent, B can be obtained from A by a sequence of elementary row operations; hence $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ generate the same space.


5.47. Let $v_1, \ldots, v_n$ belong to a vector space V over a field K. Let

\[ \begin{aligned} w_1 &= a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n \\ w_2 &= a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n \\ &\cdots \\ w_n &= a_{n1}v_1 + a_{n2}v_2 + \cdots + a_{nn}v_n \end{aligned} \]

where $a_{ij} \in K$. Let P be the n-square matrix of coefficients, i.e. let $P = (a_{ij})$.

(i) Suppose P is invertible. Show that $\{w_i\}$ and $\{v_i\}$ generate the same space; hence $\{w_i\}$ is independent if and only if $\{v_i\}$ is independent.
(ii) Suppose P is not invertible. Show that $\{w_i\}$ is dependent.
(iii) Suppose $\{w_i\}$ is independent. Show that P is invertible.

(i) Since P is invertible, it is row equivalent to the identity matrix I. Hence by the preceding problem $\{w_i\}$ and $\{v_i\}$ generate the same space. Thus one is independent if and only if the other is.

(ii) Since P is not invertible, it is row equivalent to a matrix with a zero row. This means that $\{w_i\}$ generates a space which has a generating set of fewer than n elements. Thus $\{w_i\}$ is dependent.

(iii) This is the contrapositive of the statement of (ii), and so it follows from (ii).
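Parts (i) and (ii) can be illustrated numerically. In the Python sketch below, the vectors `v` and the two coefficient matrices are assumptions chosen for the example (one invertible, one with its second row equal to twice its first), and `rank` and `combine` are hypothetical helper names.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after reducing the matrix to echelon form."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        p = next((i for i in range(pivot_row, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[pivot_row], m[p] = m[p], m[pivot_row]
        for i in range(pivot_row + 1, len(m)):
            f = m[i][col] / m[pivot_row][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[pivot_row])]
        pivot_row += 1
    return pivot_row

def combine(P, vectors):
    """Return the vectors w_i = sum_j P[i][j] * v_j, one for each row i of P."""
    return [[sum(P[i][j] * vectors[j][k] for j in range(len(vectors)))
             for k in range(len(vectors[0]))]
            for i in range(len(P))]

v = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]             # independent vectors in R^3
P_invertible = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]  # triangular with nonzero diagonal
P_singular = [[1, 2, 3], [2, 4, 6], [0, 1, 0]]    # second row = 2 * first row

w_good = combine(P_invertible, v)   # still spans a 3-dimensional space
w_bad = combine(P_singular, v)      # dependent: spans only a 2-dimensional space
print(rank(w_good), rank(w_bad))    # prints "3 2"
```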


5.48. Suppose V is the direct sum of its subspaces U and W, i.e. $V = U \oplus W$. Show that: (i) if $\{u_1, \ldots, u_m\} \subset U$ and $\{w_1, \ldots, w_n\} \subset W$ are independent, then $\{u_i, w_j\}$ is also independent; (ii) dim V = dim U + dim W.

(i) Suppose $a_1u_1 + \cdots + a_mu_m + b_1w_1 + \cdots + b_nw_n = 0$, where $a_i$, $b_j$ are scalars. Then

\[ 0 = (a_1u_1 + \cdots + a_mu_m) + (b_1w_1 + \cdots + b_nw_n) = 0 + 0 \]

where $0, \ a_1u_1 + \cdots + a_mu_m \in U$ and $0, \ b_1w_1 + \cdots + b_nw_n \in W$. Since such a sum for 0 is unique, this leads to

\[ a_1u_1 + \cdots + a_mu_m = 0, \qquad b_1w_1 + \cdots + b_nw_n = 0 \]

The independence of the $u_i$ implies that the $a_i$ are all 0, and the independence of the $w_j$ implies that the $b_j$ are all 0. Consequently, $\{u_i, w_j\}$ is independent.

(ii) Method 1. Since $V = U \oplus W$, we have V = U + W and $U \cap W = \{0\}$. Thus, by Theorem 5.8, page 90,

\[ \dim V = \dim U + \dim W - \dim(U \cap W) = \dim U + \dim W - 0 = \dim U + \dim W \]

Method 2. Suppose $\{u_1, \ldots, u_r\}$ and $\{w_1, \ldots, w_s\}$ are bases of U and W respectively. Since they generate U and W respectively, $\{u_i, w_j\}$ generates V = U + W. On the other hand, by (i), $\{u_i, w_j\}$ is independent. Thus $\{u_i, w_j\}$ is a basis of V; hence dim V = dim U + dim W.


5.49. Let U be a subspace of a vector space V of finite dimension. Show that there exists a subspace W of V such that $V = U \oplus W$.

Let $\{u_1, \ldots, u_r\}$ be a basis of U. Since $\{u_i\}$ is linearly independent, it can be extended to a basis of V, say, $\{u_1, \ldots, u_r, w_1, \ldots, w_s\}$. Let W be the space generated by $\{w_1, \ldots, w_s\}$. Since $\{u_i, w_j\}$ generates V, V = U + W. On the other hand, $U \cap W = \{0\}$ (Problem 5.62). Accordingly, $V = U \oplus W$.


5.50. Recall (page 65) that if K is a subfield of a field E (or: E is an extension of K), then E may be viewed as a vector space over K. (i) Show that the complex field C is a vector space of dimension 2 over the real field R. (ii) Show that the real field R is a vector space of infinite dimension over the rational field Q.

(i) We claim that $\{1, i\}$ is a basis of C over R. For if $v \in \mathbf{C}$, then $v = a + bi = a \cdot 1 + b \cdot i$ where $a, b \in \mathbf{R}$; that is, $\{1, i\}$ generates C over R. Furthermore, if $x \cdot 1 + y \cdot i = 0$ or $x + yi = 0$, where $x, y \in \mathbf{R}$, then x = 0 and y = 0; that is, $\{1, i\}$ is linearly independent over R. Thus $\{1, i\}$ is a basis of C over R, and so C is of dimension 2 over R.

(ii) We claim that, for any n, $\{1, \pi, \pi^2, \ldots, \pi^n\}$ is linearly independent over Q. For suppose $a_0 1 + a_1\pi + a_2\pi^2 + \cdots + a_n\pi^n = 0$, where the $a_i \in \mathbf{Q}$, and not all the $a_i$ are 0. Then $\pi$ is a root of the following nonzero polynomial over Q: $a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$. But it can be shown that $\pi$ is a transcendental number, i.e. that $\pi$ is not a root of any nonzero polynomial over Q. Accordingly, the n + 1 real numbers $1, \pi, \pi^2, \ldots, \pi^n$ are linearly independent over Q. Thus for any finite n, R cannot be of dimension n over Q, i.e. R is of infinite dimension over Q.


5.51. Let $K$ be a subfield of a field $L$ and $L$ a subfield of a field $E$: $K \subset L \subset E$. (Hence $K$ is a subfield of $E$.) Suppose that $E$ is of dimension $n$ over $L$ and $L$ is of dimension $m$ over $K$. Show that $E$ is of dimension $mn$ over $K$.

Suppose $\{v_1,\ldots,v_n\}$ is a basis of $E$ over $L$ and $\{a_1,\ldots,a_m\}$ is a basis of $L$ over $K$. We claim that $\{a_iv_j : i = 1,\ldots,m,\ j = 1,\ldots,n\}$ is a basis of $E$ over $K$. Note that $\{a_iv_j\}$ contains $mn$ elements.

Let $w$ be any arbitrary element in $E$. Since $\{v_1,\ldots,v_n\}$ generates $E$ over $L$, $w$ is a linear combination of the $v_j$ with coefficients in $L$:

    $w = b_1v_1 + b_2v_2 + \cdots + b_nv_n, \qquad b_j \in L$    (1)

Since $\{a_1,\ldots,a_m\}$ generates $L$ over $K$, each $b_j \in L$ is a linear combination of the $a_i$ with coefficients in $K$:

    $b_1 = k_{11}a_1 + k_{12}a_2 + \cdots + k_{1m}a_m$
    $b_2 = k_{21}a_1 + k_{22}a_2 + \cdots + k_{2m}a_m$
    $\cdots$
    $b_n = k_{n1}a_1 + k_{n2}a_2 + \cdots + k_{nm}a_m$

where $k_{ji} \in K$. Substituting in (1), we obtain

    $w = (k_{11}a_1 + \cdots + k_{1m}a_m)v_1 + (k_{21}a_1 + \cdots + k_{2m}a_m)v_2 + \cdots + (k_{n1}a_1 + \cdots + k_{nm}a_m)v_n$
    $\ \ = k_{11}a_1v_1 + \cdots + k_{1m}a_mv_1 + k_{21}a_1v_2 + \cdots + k_{2m}a_mv_2 + \cdots + k_{n1}a_1v_n + \cdots + k_{nm}a_mv_n$
    $\ \ = \sum_{i,j} k_{ji}(a_iv_j)$

where $k_{ji} \in K$. Thus $w$ is a linear combination of the $a_iv_j$ with coefficients in $K$; hence $\{a_iv_j\}$ generates $E$ over $K$.

The proof is complete if we show that $\{a_iv_j\}$ is linearly independent over $K$. Suppose, for scalars $x_{ji} \in K$, $\sum_{i,j} x_{ji}(a_iv_j) = 0$; that is,

    $(x_{11}a_1v_1 + x_{12}a_2v_1 + \cdots + x_{1m}a_mv_1) + \cdots + (x_{n1}a_1v_n + x_{n2}a_2v_n + \cdots + x_{nm}a_mv_n) = 0$

or

    $(x_{11}a_1 + x_{12}a_2 + \cdots + x_{1m}a_m)v_1 + \cdots + (x_{n1}a_1 + x_{n2}a_2 + \cdots + x_{nm}a_m)v_n = 0$

Since $\{v_1,\ldots,v_n\}$ is linearly independent over $L$ and since the above coefficients of the $v_j$ belong to $L$, each coefficient must be 0:

    $x_{11}a_1 + x_{12}a_2 + \cdots + x_{1m}a_m = 0, \quad \ldots, \quad x_{n1}a_1 + x_{n2}a_2 + \cdots + x_{nm}a_m = 0$

But $\{a_1,\ldots,a_m\}$ is linearly independent over $K$; hence, since the $x_{ji} \in K$,

    $x_{11} = 0,\ x_{12} = 0,\ \ldots,\ x_{1m} = 0,\ \ldots,\ x_{n1} = 0,\ x_{n2} = 0,\ \ldots,\ x_{nm} = 0$

Accordingly, $\{a_iv_j\}$ is linearly independent over $K$ and the theorem is proved.


Supplementary Problems 


LINEAR DEPENDENCE 


5.52. Determine whether $u$ and $v$ are linearly dependent where:


(G)- «w= 1, 2,3; 4), v = @, 3,2, 1) (iti) (Ove) hes (OR — 3) 
(ii) «= (—1, 6, -12), v = (4, —3, 6) Gain eas (Ae 0,10) yee (0. ena 

AD 2, onal i ie 1) WW) ak 
(v) w= (5 Beer . (vi) w= ({ eee ee el 
(vii) u=—-8+412-16, v=18-1e+8 (vill) w= 84+3t+4, v= B+4t4+3 


5.53. Determine whether the following vectors in $\mathbf{R}^4$ are linearly dependent or independent: (i) (1, 3, -1, 4),
(3, 8, =5, 7), (2, 9, 4, 23); (ii) (1, 2s 4, ine (2; ds 0, =o) 5 (3, S05 iL; 4). 


5.54. Let $V$ be the vector space of $2 \times 3$ matrices over $\mathbf{R}$. Determine whether the matrices $A, B, C \in V$ are linearly dependent or independent where:


12s 3 1-1 4 — (3 -8 é 
i — = ’ Eo = 
Oe e 4 aut x 6 5-2 Bie 1 
2 1-1 Re a 4-1 a) 
== G = 
\ —2 me - ee 0 we fa —2 -8 


5.55. Let $V$ be the vector space of polynomials of degree $\le 3$ over $\mathbf{R}$. Determine whether $u, v, w \in V$ are linearly dependent or independent where:


GG) a= 8 — 477 263, v= + 24+ 4t-—1, w = 28—#-—3t+5 


II 


(ii) A 


(ii) wo = 8 —52—2t+3, v = 8&—42-3¢+4, w = 2-—T2—1t+9 




Let V be the vector space of functions from R into R. Show that f,g,h©€V are linearly independ- 
ent where: (i) f(t) = et, g(t)=sint, h(t) =; (ii) f() =—et, g(t) =e, h(t) =t; (iii) fO =e, 
gtt) = sin t, h(t) =cos t. 

5.57. Show that: (i) the vectors $(1-i,\ i)$ and $(2,\ -1+i)$ in $\mathbf{C}^2$ are linearly dependent over the complex field $\mathbf{C}$ but are linearly independent over the real field $\mathbf{R}$; (ii) the vectors $(3+\sqrt{2},\ 1+\sqrt{2})$ and $(7,\ 1+2\sqrt{2})$ in $\mathbf{R}^2$ are linearly dependent over the real field $\mathbf{R}$ but are linearly independent over the rational field $\mathbf{Q}$.


5.58. Suppose $u$, $v$ and $w$ are linearly independent vectors. Show that:
(i) $u+v-2w$, $u-v-w$ and $u+w$ are linearly independent;
(ii) $u+v-3w$, $u+3v-w$ and $v+w$ are linearly dependent.


5.59. Prove or show a counterexample: If the nonzero vectors $u$, $v$ and $w$ are linearly dependent, then $w$ is a linear combination of $u$ and $v$.


5.60. Suppose $v_1, v_2, \ldots, v_n$ are linearly independent vectors. Prove the following:

(i) $\{a_1v_1, a_2v_2, \ldots, a_nv_n\}$ is linearly independent where each $a_i \ne 0$.

(ii) $\{v_1, \ldots, v_{i-1}, w, v_{i+1}, \ldots, v_n\}$ is linearly independent where $w = b_1v_1 + \cdots + b_iv_i + \cdots + b_nv_n$ and $b_i \ne 0$.


5.61. Let $v = (a, b)$ and $w = (c, d)$ belong to $K^2$. Show that $\{v, w\}$ is linearly dependent if and only if $ad - bc = 0$.


5.62. Suppose $\{u_1,\ldots,u_r, w_1,\ldots,w_s\}$ is a linearly independent subset of a vector space $V$. Show that $L(u_i) \cap L(w_j) = \{0\}$. (Recall that $L(u_i)$ is the linear span, i.e. the space generated by the $u_i$.)


5.63. Suppose $(a_{11}, \ldots, a_{1n}), \ldots, (a_{m1}, \ldots, a_{mn})$ are linearly independent vectors in $K^n$, and suppose $v_1, \ldots, v_n$ are linearly independent vectors in a vector space $V$ over $K$. Show that the vectors

    $w_1 = a_{11}v_1 + \cdots + a_{1n}v_n, \quad \ldots, \quad w_m = a_{m1}v_1 + \cdots + a_{mn}v_n$

are also linearly independent.


BASIS AND DIMENSION 


5.64. Determine whether or not each of the following forms a basis of $\mathbf{R}^2$:
Gee Geel) Bandis(sse) (iii) (0,1) and (0, —8) 
(int) (2, a, (I 3) eval (@, ») (iv) (2,1) and (—8, 87) 


5.65. Determine whether or not each of the following forms a basis of $\mathbf{R}^3$:
Gi) (lg 2 1D) aunt (Op Si, 1) 

(ii) (2, 4, —8), (0,1, 1) and (0, 1, —1) 

(ili) (1, 55'—6), \(2, 1, 8),0(8,-—4, 4) and (2, 111) 

(iv) (1, 38, —4), (1, 4, —3) and (2, 3, —11) 


5.66. Find a basis and the dimension of the subspace $W$ of $\mathbf{R}^4$ generated by:
(i) (1, 4, =Al. 3), (2, ib =a ==) and (0, 2, Ib —d) 
(ii) Ge —4, =P, 1); (Ue a, =il, 2) and (3, =) ey 7) 


5.67. Let $V$ be the space of $2 \times 2$ matrices over $\mathbf{R}$ and let $W$ be the subspace generated by
il = Al tl 1b Sy 
ee :) a 5) ie 7) obs a %) 
Find a basis and the dimension of W. 


5.68. Let $W$ be the space generated by the polynomials
“w= B+ 2A 2641, vv = 8432 t+ 4 and ww = 08 + 2 = Wo 


Find a basis and the dimension of W. 


5.69. Find a basis and the dimension of the solution space $W$ of each homogeneous system:


Let tOUNa toe — a0 BB OUD oy gg 

Set Dip = 
oa Si SP eee = (0 26.1) 3Y — 22) = si wae ¢ 
384 + by + 8 = 0 Qe-— y+ 2=0 eee AR ae 


(@) (ii) (iii) 


5.70. Find a basis and the dimension of the solution space $W$ of each homogeneous system:


Be me PAN) ee SEW me 1) Leo OSE Aa) 
Rh SS VAY eo) 2 SO RG ON ss XT Py, op AV Vig > Bae lity =" (0) 
DU Aye eit esi ta —4 0 Pope LY ay gas ANS —— Pp ea (I) 


(i) (ii) 


5.71. Find a homogeneous system whose solution set $W$ is generated by
{(1, 2 0, 3, ek) (2, wath) 2, 5, =), (1, any ib 2, —2)} 


5.72. Let $V$ and $W$ be the following subspaces of $\mathbf{R}^4$:
    $V = \{(a, b, c, d) : b - 2c + d = 0\}, \qquad W = \{(a, b, c, d) : a = d,\ b = 2c\}$
Find a basis and the dimension of (i) $V$, (ii) $W$, (iii) $V \cap W$.


5.73. Let $V$ be the vector space of polynomials in $t$ of degree $\le n$. Determine whether or not each of the following is a basis of $V$:

(i) $\{1,\ 1+t,\ 1+t+t^2,\ 1+t+t^2+t^3,\ \ldots,\ 1+t+t^2+\cdots+t^{n-1}+t^n\}$
(ii) $\{1+t,\ t+t^2,\ t^2+t^3,\ \ldots,\ t^{n-2}+t^{n-1},\ t^{n-1}+t^n\}$


SUMS AND INTERSECTIONS 


5.74. Suppose $U$ and $W$ are 2-dimensional subspaces of $\mathbf{R}^3$. Show that $U \cap W \ne \{0\}$.


5.75. Suppose $U$ and $W$ are subspaces of $V$ and that $\dim U = 4$, $\dim W = 5$ and $\dim V = 7$. Find the possible dimensions of $U \cap W$.


5.76. Let $U$ and $W$ be subspaces of $\mathbf{R}^3$ for which $\dim U = 1$, $\dim W = 2$ and $U \not\subset W$. Show that $\mathbf{R}^3 = U \oplus W$.


5.77. Let $U$ be the subspace of $\mathbf{R}^5$ generated by
{(1, 3, =3 ale =H), (iy, 4, aly ie =2)), (2, 2} 0, =o), 2) 
and let W be the subspace generated by 
{(1, 6, 2, 7p 3), (2, 8, als =6; =), ( 3, Salt, =); —6)} 
Find (i) dim(U+W), (ii) dim(UNW). 


5.78. Let $V$ be the vector space of polynomials over $\mathbf{R}$. Let $U$ and $W$ be the subspaces generated by
{t8+ 42—t+8, 84+ 52+5, 3+ 102—5t+5} and {t8+ 422+ 6, + 20 —t+5, 23 + 24% — 3t+ 9} 


respectively. Find (i) dim(U + W), (ii) dim (UNW). 


5.79. Let $U$ be the subspace of $\mathbf{R}^5$ generated by
f(1, —1, —1, —2, 0), i, --2;—2, 0, —8), (1, —1, —2, -2, 1)} 


and let W be the subspace generated by 
{(1, —2, —3, 0, —2), (1, —1, —3, 2, —4), (1, —1, —2, 2, —5)} 


(i) Find two homogeneous systems whose solution spaces are U and W, respectively. 


(ii) Find a basis and the dimension of Un W. 




COORDINATE VECTORS 


5.80. Consider the following basis of $\mathbf{R}^2$: $\{(2, 1), (1, -1)\}$. Find the coordinate vector of $v \in \mathbf{R}^2$ relative to the above basis where: (i) $v = (2, 3)$; (ii) $v = (4, -1)$; (iii) $v = (3, -3)$; (iv) $v = (a, b)$.


5.81. In the vector space $V$ of polynomials in $t$ of degree $\le 3$, consider the following basis: $\{1,\ 1-t,\ (1-t)^2,\ (1-t)^3\}$. Find the coordinate vector of $v \in V$ relative to the above basis if: (i) $v = 2 - 3t + t^2 + 2t^3$; (ii) $v = 3 - 2t - t^2$; (iii) $v = a + bt + ct^2 + dt^3$.


5.82. In the vector space W of 2 X 2 symmetric matrices over R, consider the following basis: 


(Gamage 


Find the coordinate vector of the matrix A €W relative to the above basis if: 


- ong 
(i) a=(5 4) (ii) a=(; 1) 


5.83. Consider the following two bases of R?: 
fe, = (1,1, 1), €.= (0,2, 8), es = (0,2,—1)} and. _.{f, = (1,1, 0), fe = (L,—1,0), fg —.0,0, 0} 
(i) Find the coordinate vector of v = (3,5,—2) relative to each basis: [v], and [v];. 
(ii) Find the matrix P whose rows are respectively the coordinate vectors of the e; relative to the 
basis {f1, fo, fs}- 
(iii) Verify that [v],P = |v}, 
5.84. Suppose $\{e_1,\ldots,e_n\}$ and $\{f_1,\ldots,f_n\}$ are bases of a vector space $V$ (of dimension $n$). Let $P$ be the matrix whose rows are respectively the coordinate vectors of the $e$'s relative to the basis $\{f_i\}$. Prove that for any vector $v \in V$, $[v]_e P = [v]_f$. (This result is proved in Problem 5.39 in the case $n = 3$.)


5.85. Show that the coordinate vector of $0 \in V$ relative to any basis of $V$ is always the zero $n$-tuple $(0, 0, \ldots, 0)$.


RANK OF A MATRIX 
5.86. Find the rank of each matrix: 


IP a Sieh elias 8} 8} pee ey 7m a 
ee er a oar aD Ik Gy \e=72 One ANS A et $i 
Lae MaDe AES Bef he ill Dey al = Ouaae 
Cn Om Ome eee ee OOo ee By =! 


5.87. Let $A$ and $B$ be arbitrary $m \times n$ matrices. Show that $\operatorname{rank}(A + B) \le \operatorname{rank}(A) + \operatorname{rank}(B)$.


5.88. Give examples of 2 X 2 matrices A and B such that: 
(i) rank (A +B) < rank (A), rank (B) (ii) rank (A + B) = rank(A) = rank (B) 
(iii) rank (A + B) > rank (A), rank (B) 


MISCELLANEOUS PROBLEMS 
5.89. Let $W$ be the vector space of $3 \times 3$ symmetric matrices over $K$. Show that $\dim W = 6$ by exhibiting a basis of $W$. (Recall that $A = (a_{ij})$ is symmetric iff $a_{ij} = a_{ji}$.)


5.90. Let $W$ be the vector space of $3 \times 3$ antisymmetric matrices over $K$. Show that $\dim W = 3$ by exhibiting a basis of $W$. (Recall that $A = (a_{ij})$ is antisymmetric iff $a_{ij} = -a_{ji}$.)


5.91. Suppose dim V =n. Show that a generating set with n elements is a basis. (Compare with Theorem 
5.6(iii), page 89). 


5.92. Let $t_1, t_2, \ldots, t_n$ be symbols, and let $K$ be any field. Let $V$ be the set of expressions $a_1t_1 + a_2t_2 + \cdots + a_nt_n$ where $a_i \in K$. Define addition in $V$ by

    $(a_1t_1 + a_2t_2 + \cdots + a_nt_n) + (b_1t_1 + b_2t_2 + \cdots + b_nt_n) = (a_1 + b_1)t_1 + (a_2 + b_2)t_2 + \cdots + (a_n + b_n)t_n$

Define scalar multiplication on $V$ by

    $k(a_1t_1 + a_2t_2 + \cdots + a_nt_n) = ka_1t_1 + ka_2t_2 + \cdots + ka_nt_n$

Show that $V$ is a vector space over $K$ with the above operations. Also show that $\{t_1,\ldots,t_n\}$ is a basis of $V$ where, for $i = 1,\ldots,n$,

    $t_i = 0t_1 + \cdots + 0t_{i-1} + 1t_i + 0t_{i+1} + \cdots + 0t_n$


5.93. Let V be a vector space of dimension n over a field K, and let K be a vector space of dimension m 
over a subfield F. (Hence V may also be viewed as a vector space over the subfield F.) Prove that 
the dimension of V over F is mn. 


5.94. Let $U$ and $W$ be vector spaces over the same field $K$, and let $V$ be the external direct sum of $U$ and $W$ (see Problem 4.45). Let $\hat{U}$ and $\hat{W}$ be the subspaces of $V$ defined by $\hat{U} = \{(u, 0) : u \in U\}$ and $\hat{W} = \{(0, w) : w \in W\}$.

(i) Show that $U$ is isomorphic to $\hat{U}$ under the correspondence $u \leftrightarrow (u, 0)$, and that $W$ is isomorphic to $\hat{W}$ under the correspondence $w \leftrightarrow (0, w)$.

(ii) Show that $\dim V = \dim U + \dim W$.


5.95. Suppose $V = U \oplus W$. Let $\hat{V}$ be the external direct product of $U$ and $W$. Show that $V$ is isomorphic to $\hat{V}$ under the correspondence $v = u + w \leftrightarrow (u, w)$.


Answers to Supplementary Problems
5.52. (i) no, (ii) yes, (iii) yes, (iv) no, (v) yes, (vi) no, (vii) yes, (viii) no. 
5.53. (i) dependent, (ii) independent. 
5.54. (i) dependent, (ii) independent. 
5.55. (i) independent, (ii) dependent. 
5.57. (i) $(2,\ -1+i) = (1+i)(1-i,\ i)$; (ii) $(7,\ 1+2\sqrt{2}) = (3-\sqrt{2})(3+\sqrt{2},\ 1+\sqrt{2})$.


5.59. The statement is false. Counterexample: $u = (1, 0)$, $v = (2, 0)$ and $w = (1, 1)$ in $\mathbf{R}^2$. Lemma 5.2 requires that one of the nonzero vectors $u, v, w$ is a linear combination of the preceding ones. In this case, $v = 2u$.
5.64. (i) yes, (ii) no, (iii) no, (iv) yes. 
5.65. (i) no, (ii) yes, (iii) no, (iv) no. 
5.66. (i) dim W = 3, (ii) dim W = 2.
5.67. dim W = 2. 
5.68. dim W = 2. 
5.69. — (i) basis, {(7, —1, —2)}; dim W = 1. (ii) dimW=0. (iii) basis, {(18, —1, —7)}; dim W = 1. 


5.70. (i) basis, {(2, —1,0, 0, 0), (4, 0, inh —1, 0), ey, 0, i 0, 1)}; dim W = 8. 
(ii) basis, {(2, il; 0, 0, 0), CL, 0, 1, 0, 0)}; dim W = 2. 




BY ap Wi eG = 1) 
5.71. 
Bap De os MY 


5.72. (i) basis, {(1,0, 0,0), (0, 2,1, 0), (0, —1,0,1)}; dim V = 3. 
(ii) basis, {(1,0, 0,1), (0,2,1,0)}; dim W = 2. 


(iii) basis, {(0,2,1,0)}; dim(VNW) =1. Hint. VOW must satisfy all three conditions on a, b,c 
and d. 


5.73. (i) yes, (ii) no. For $\dim V = n + 1$, but the set contains only $n$ elements.
5.75. $\dim(U \cap W) = 2$, 3 or 4.
5.77. $\dim(U + W) = 3$, $\dim(U \cap W) = 2$.


5.78. dim(U+W) =3, dim(UNW) =1. 


5.79 (i) 30 4y — 2 Sy = Ax + 2y = 5 = 
: Aa + Qy +s =" 9a +2Qy+z st 


(ii) {(1, —2, —5, 0, 0), (0, 0, 1, 0, —1)}..dim (UN W) = 2. 


oOo > 


5.80. (i) $[v] = (5/3, -4/3)$, (ii) $[v] = (1, 2)$, (iii) $[v] = (0, 3)$, (iv) $[v] = ((a + b)/3,\ (a - 2b)/3)$.
5.81. (i) $[v] = (2, -5, 7, -2)$, (ii) $[v] = (0, 4, -1, 0)$, (iii) $[v] = (a + b + c + d,\ -b - 2c - 3d,\ c + 3d,\ -d)$.


5.82. (i) [A] =(2,—1, 1), (ii) [A] = (8, 1, —2). 


derasOcw 
5.83) (i) [ule = (3, -1, 2), [oly = 4, -1,-2); Gi) P={1 +1 3 
1 °=1 = 


5.86. (i) 8, (ii) 2, (iii) 8, (iv) 2. 


Rie ees yal Ea ORS aes eed) Le 700 
5.88. (i) >A Gj o)? He (ig 7) Gel 0)? Be . 


I POs SO 1 =O OO Oa On OnmnO On as0 Oo Oma0 
5.89 Oa Osa Oule 0 Ao Olen OS. Ose oO perleeneat Ole Or One taills 
Op 0 a0 OR Ok 20) IE WOE) Ue OY — © Ore OF Olea 
Opes £0 Or Ox Al OnO 
5.90 =e ORs On , 0 0 
Osa O a0, oe OheatO) Ural 


5.93. Hint. The proof is identical to that given in Problem 5.48, page 113, for a special case (when V is 
an extension field of K). 


Chapter 6 


Linear Mappings 


MAPPINGS 


Let $A$ and $B$ be arbitrary sets. Suppose to each $a \in A$ there is assigned a unique element of $B$; the collection, $f$, of such assignments is called a function or mapping (or: map) from $A$ into $B$, and is written

    $f : A \to B$   or   $A \xrightarrow{\ f\ } B$


We write $f(a)$, read "f of a", for the element of $B$ that $f$ assigns to $a \in A$; it is called the value of $f$ at $a$ or the image of $a$ under $f$. If $A'$ is any subset of $A$, then $f(A')$ denotes the set of images of elements of $A'$; and if $B'$ is any subset of $B$, then $f^{-1}(B')$ denotes the set of elements of $A$ each of whose images lies in $B'$:

    $f(A') = \{f(a) : a \in A'\}$   and   $f^{-1}(B') = \{a \in A : f(a) \in B'\}$

We call $f(A')$ the image of $A'$ and $f^{-1}(B')$ the inverse image or preimage of $B'$. In particular, the set of all images, i.e. $f(A)$, is called the image (or: range) of $f$. Furthermore, $A$ is called the domain of the mapping $f : A \to B$, and $B$ is called its co-domain.


To each mapping $f : A \to B$ there corresponds the subset of $A \times B$ given by $\{(a, f(a)) : a \in A\}$. We call this set the graph of $f$. Two mappings $f : A \to B$ and $g : A \to B$ are defined to be equal, written $f = g$, if $f(a) = g(a)$ for every $a \in A$, that is, if they have the same graph. Thus we do not distinguish between a function and its graph. The negation of $f = g$ is written $f \ne g$ and is the statement: there exists an $a \in A$ for which $f(a) \ne g(a)$.


Example 6.1: Let $A = \{a, b, c, d\}$ and $B = \{x, y, z, w\}$. The following diagram defines a mapping $f$ from $A$ into $B$:
    [Diagram: arrows from a to y, from b to x, from c to z, and from d to y]
Here $f(a) = y$, $f(b) = x$, $f(c) = z$, and $f(d) = y$. Also,
    $f(\{a, b, d\}) = \{f(a), f(b), f(d)\} = \{y, x, y\} = \{x, y\}$
The image (or: range) of $f$ is the set $\{x, y, z\}$: $f(A) = \{x, y, z\}$.
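The computations of Example 6.1 can be sketched for finite sets as follows. This is only an illustration: representing $f$ by a Python dictionary, and the helper names `image` and `preimage`, are assumptions of the sketch and not notation of the text.

```python
# Example 6.1 encoded as a finite mapping; the dict stands in for f.
A = {"a", "b", "c", "d"}
B = {"x", "y", "z", "w"}
f = {"a": "y", "b": "x", "c": "z", "d": "y"}

def image(f, subset):
    """f(A'): the set of images of the elements of subset."""
    return {f[a] for a in subset}

def preimage(f, targets):
    """f^{-1}(B'): the elements of the domain whose image lies in targets."""
    return {a for a in f if f[a] in targets}

print(image(f, {"a", "b", "d"}))  # the set {x, y}
print(image(f, A))                # the image of f: {x, y, z}
print(preimage(f, {"y"}))         # the set {a, d}
print(preimage(f, {"w"}))         # empty: w is the image of no element
```

Note that the preimage of $\{w\}$ is empty, which reflects the fact that the image $f(A)$ is a proper subset of the co-domain $B$.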
Example 6.2: Let $f : \mathbf{R} \to \mathbf{R}$ be the mapping which assigns to each real number $x$ its square $x^2$:

    $x \mapsto x^2$   or   $f(x) = x^2$

Here the image of $-3$ is 9 so we may write $f(-3) = 9$.


We use the arrow $\mapsto$ to denote the image of an arbitrary element $x \in A$ under a mapping $f : A \to B$ by writing

    $x \mapsto f(x)$

as illustrated in the preceding example.
: 3 SB) : 3 
Example 6.3: Consider the 2 X 8 matrix A = op Brae If we write the vectors in R? and 
$\mathbf{R}^2$ as column vectors, then $A$ determines the mapping $T : \mathbf{R}^3 \to \mathbf{R}^2$ defined by
    $v \mapsto Av$, that is, $T(v) = Av$, $v \in \mathbf{R}^3$
: afte || _ (720 
Thus if v = ; , then T(v) = Av = ay : = ( oe 


Remark: Every $m \times n$ matrix $A$ over a field $K$ determines the mapping $T : K^n \to K^m$ defined by

    $v \mapsto Av$

where the vectors in $K^n$ and $K^m$ are written as column vectors. For convenience we shall usually denote the above mapping by $A$, the same symbol used for the matrix.


Example 6.4: Let $V$ be the vector space of polynomials in the variable $t$ over the real field $\mathbf{R}$. Then the derivative defines a mapping $D : V \to V$ where, for any polynomial $f \in V$, we let $D(f) = df/dt$. For example, $D(3t^2 - 5t + 2) = 6t - 5$.


Example 6.5: Let $V$ be the vector space of polynomials in $t$ over $\mathbf{R}$ (as in the preceding example). Then the integral from, say, 0 to 1 defines a mapping $\mathcal{J} : V \to \mathbf{R}$ where, for any polynomial $f \in V$, we let $\mathcal{J}(f) = \int_0^1 f(t)\,dt$. For example,

    $\mathcal{J}(3t^2 - 5t + 2) = \int_0^1 (3t^2 - 5t + 2)\,dt = \tfrac{1}{2}$

Note that this map is from the vector space $V$ into the scalar field $\mathbf{R}$, whereas the map in the preceding example is from $V$ into itself.
Example 6.6: Consider two mappings $f : A \to B$ and $g : B \to C$ illustrated below:

    $A \xrightarrow{\ f\ } B \xrightarrow{\ g\ } C$

Let $a \in A$; then $f(a) \in B$, the domain of $g$. Hence we can obtain the image of $f(a)$ under the mapping $g$, that is, $g(f(a))$. This map

    $a \mapsto g(f(a))$

from $A$ into $C$ is called the composition or product of $f$ and $g$, and is denoted by $g \circ f$. In other words, $(g \circ f) : A \to C$ is the mapping defined by

    $(g \circ f)(a) = g(f(a))$
Our first theorem tells us that composition of mappings satisfies the associative law.

Theorem 6.1: Let $f : A \to B$, $g : B \to C$ and $h : C \to D$. Then $h \circ (g \circ f) = (h \circ g) \circ f$.

We prove this theorem now. If $a \in A$, then
    $(h \circ (g \circ f))(a) = h((g \circ f)(a)) = h(g(f(a)))$
and $((h \circ g) \circ f)(a) = (h \circ g)(f(a)) = h(g(f(a)))$
Thus $(h \circ (g \circ f))(a) = ((h \circ g) \circ f)(a)$ for every $a \in A$, and so $h \circ (g \circ f) = (h \circ g) \circ f$.
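Theorem 6.1 can be checked numerically on sample mappings. In the sketch below the three functions `f`, `g`, `h` and the helper `compose` are hypothetical choices made only for illustration; a finite check of this kind is of course no substitute for the proof above.

```python
def compose(g, f):
    """Return the composition g∘f, i.e. the map a -> g(f(a))."""
    return lambda a: g(f(a))

# Three sample mappings on the integers (illustrative choices).
f = lambda x: x + 1    # plays the role of f: A -> B
g = lambda x: 2 * x    # plays the role of g: B -> C
h = lambda x: x ** 2   # plays the role of h: C -> D

left = compose(h, compose(g, f))    # h∘(g∘f)
right = compose(compose(h, g), f)   # (h∘g)∘f

# Both sides agree at every sample point, as the theorem asserts.
assert all(left(a) == right(a) for a in range(-5, 6))

# Composition is not commutative in general: g∘f and f∘g differ at a = 1.
assert compose(g, f)(1) == 4 and compose(f, g)(1) == 3
```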


Remark: Let $F : A \to B$. Some texts write $aF$ instead of $F(a)$ for the image of $a \in A$ under $F$. With this notation, the composition of functions $F : A \to B$ and $G : B \to C$ is denoted by $F \circ G$ and not by $G \circ F$ as used in this text.


We next introduce some special types of mappings. 


Definition: A mapping $f : A \to B$ is said to be one-to-one (or one-one or 1-1) or injective if different elements of $A$ have distinct images; that is,
    if $a \ne a'$ implies $f(a) \ne f(a')$
or, equivalently, if $f(a) = f(a')$ implies $a = a'$.


Definition: A mapping $f : A \to B$ is said to be onto (or: $f$ maps $A$ onto $B$) or surjective if every $b \in B$ is the image of at least one $a \in A$.


A mapping which is both one-one and onto is said to be bijective.


Example 6.7: Let $f : \mathbf{R} \to \mathbf{R}$, $g : \mathbf{R} \to \mathbf{R}$ and $h : \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = 2^x$, $g(x) = x^3 - x$ and $h(x) = x^2$. The graphs of these mappings follow:

    [Figure: graphs of $f(x) = 2^x$, $g(x) = x^3 - x$ and $h(x) = x^2$]

The mapping $f$ is one-one; geometrically, this means that each horizontal line does not contain more than one point of $f$. The mapping $g$ is onto; geometrically, this means that each horizontal line contains at least one point of $g$. The mapping $h$ is neither one-one nor onto; for example, 2 and $-2$ have the same image 4, and $-16$ is not the image of any element of $\mathbf{R}$.
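For mappings between finite sets the two definitions can be tested directly. The following is a minimal sketch; the finite sets `A` and `B` and the helper names are assumptions introduced for illustration, chosen to mirror the remark about $h(x) = x^2$ in Example 6.7.

```python
def is_one_to_one(f, domain):
    """True if distinct elements of the domain have distinct images."""
    images = [f(a) for a in domain]
    return len(set(images)) == len(images)

def is_onto(f, domain, codomain):
    """True if every element of the codomain is the image of some element."""
    return set(codomain) <= {f(a) for a in domain}

A = [-2, -1, 0, 1, 2]        # a finite sample domain
B = [-16, 0, 1, 4]           # a finite sample co-domain for the squaring map

square = lambda x: x * x
negate = lambda x: -x

# Squaring is not one-one (2 and -2 both map to 4) and not onto B (-16 is missed).
print(is_one_to_one(square, A), is_onto(square, A, B))   # False False

# Negation maps A onto A and is one-one, hence bijective on A.
print(is_one_to_one(negate, A), is_onto(negate, A, A))   # True True
```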


Example 6.8: Let $A$ be any set. The mapping $f : A \to A$ defined by $f(a) = a$, i.e. which assigns to each element in $A$ itself, is called the identity mapping on $A$ and is denoted by $1_A$ or 1 or $I$.

Example 6.9: Let $f : A \to B$. We call $g : B \to A$ the inverse of $f$, written $f^{-1}$, if
    $f \circ g = 1_B$   and   $g \circ f = 1_A$

We emphasize that $f$ has an inverse if and only if $f$ is both one-to-one and onto (Problem 6.9). Also, if $b \in B$ then $f^{-1}(b) = a$ where $a$ is the unique element of $A$ for which $f(a) = b$.


LINEAR MAPPINGS 


Let $V$ and $U$ be vector spaces over the same field $K$. A mapping $F : V \to U$ is called a linear mapping (or linear transformation or vector space homomorphism) if it satisfies the following two conditions:

(1) For any $v, w \in V$, $F(v + w) = F(v) + F(w)$.

(2) For any $k \in K$ and any $v \in V$, $F(kv) = kF(v)$.

In other words, $F : V \to U$ is linear if it "preserves" the two basic operations of a vector space, that of vector addition and that of scalar multiplication.


Substituting $k = 0$ into (2) we obtain $F(0) = 0$. That is, every linear mapping takes the zero vector into the zero vector.


Now for any scalars $a, b \in K$ and any vectors $v, w \in V$ we obtain, by applying both conditions of linearity,
    $F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)$
More generally, for any scalars $a_i \in K$ and any vectors $v_i \in V$ we obtain the basic property of linear mappings:
    $F(a_1v_1 + a_2v_2 + \cdots + a_nv_n) = a_1F(v_1) + a_2F(v_2) + \cdots + a_nF(v_n)$

We remark that the condition $F(av + bw) = aF(v) + bF(w)$ completely characterizes linear mappings and is sometimes used as its definition.
Example 6.10: Let $A$ be any $m \times n$ matrix over a field $K$. As noted previously, $A$ determines a mapping $T : K^n \to K^m$ by the assignment $v \mapsto Av$. (Here the vectors in $K^n$ and $K^m$ are written as columns.) We claim that $T$ is linear. For, by properties of matrices,

    $T(v + w) = A(v + w) = Av + Aw = T(v) + T(w)$
and $T(kv) = A(kv) = kAv = kT(v)$
where $v, w \in K^n$ and $k \in K$.
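The claim of Example 6.10 can be spot-checked with exact arithmetic. In this sketch the matrix `A`, the vectors `v`, `w` and the scalars `a`, `b` are hypothetical sample data, and the helpers `mat_vec`, `add` and `scale` are names introduced here; the check uses the single condition $F(av + bw) = aF(v) + bF(w)$ noted above.

```python
from fractions import Fraction

def mat_vec(A, v):
    """The product Av, with v treated as a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def add(v, w):
    """Componentwise sum v + w."""
    return [x + y for x, y in zip(v, w)]

def scale(k, v):
    """Scalar multiple kv."""
    return [k * x for x in v]

A = [[1, -3, 5],
     [2, 4, -1]]                 # a sample 2 x 3 matrix, so T maps K^3 into K^2
v, w = [3, 1, -2], [0, 2, 7]     # sample vectors in K^3
a, b = Fraction(1, 2), -4        # sample scalars

lhs = mat_vec(A, add(scale(a, v), scale(b, w)))            # T(av + bw)
rhs = add(scale(a, mat_vec(A, v)), scale(b, mat_vec(A, w)))  # aT(v) + bT(w)
assert lhs == rhs
```

Using `Fraction` keeps the arithmetic exact, so the equality is tested without rounding error.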


We comment that the above type of linear mapping shall occur again and again. In 
fact, in the next chapter we show that every linear mapping from one finite-dimensional 
vector space into another can be represented as a linear mapping of the above type. 

Example 6.11: Let $F : \mathbf{R}^3 \to \mathbf{R}^3$ be the "projection" mapping into the $xy$ plane: $F(x, y, z) = (x, y, 0)$. We show that $F$ is linear. Let $v = (a, b, c)$ and $w = (a', b', c')$. Then

    $F(v + w) = F(a + a',\ b + b',\ c + c') = (a + a',\ b + b',\ 0)$
    $\qquad\quad\ = (a, b, 0) + (a', b', 0) = F(v) + F(w)$
and, for any $k \in \mathbf{R}$,
    $F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)$

That is, $F$ is linear.

Example 6.12: Let $F : \mathbf{R}^2 \to \mathbf{R}^2$ be the "translation" mapping defined by $F(x, y) = (x + 1,\ y + 2)$. Observe that $F(0) = F(0, 0) = (1, 2) \ne 0$. That is, the zero vector is not mapped onto the zero vector. Hence $F$ is not linear.
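The failure of linearity in Example 6.12 can be seen concretely. The sketch below evaluates the translation at the zero vector and also exhibits a pair of vectors for which additivity fails; the particular vectors `v` and `w` are arbitrary sample choices.

```python
def F(v):
    """The translation mapping F(x, y) = (x + 1, y + 2) of Example 6.12."""
    x, y = v
    return (x + 1, y + 2)

def add(v, w):
    """Componentwise sum of two vectors in R^2."""
    return (v[0] + w[0], v[1] + w[1])

print(F((0, 0)))                 # (1, 2): the zero vector is not mapped to zero

v, w = (1, 1), (2, 3)
print(F(add(v, w)))              # F(v + w) = (4, 6)
print(add(F(v), F(w)))           # F(v) + F(w) = (5, 8), which differs
```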


Example 6.13: Let $F : V \to U$ be the mapping which assigns $0 \in U$ to every $v \in V$. Then, for any $v, w \in V$ and any $k \in K$, we have
    $F(v + w) = 0 = 0 + 0 = F(v) + F(w)$   and   $F(kv) = 0 = k0 = kF(v)$
Thus $F$ is linear. We call $F$ the zero mapping and shall usually denote it by 0.


Example 6.14: Consider the identity mapping $I : V \to V$ which maps each $v \in V$ into itself. Then, for any $v, w \in V$ and any $a, b \in K$, we have

    $I(av + bw) = av + bw = aI(v) + bI(w)$
Thus $I$ is linear.


Example 6.15: Let $V$ be the vector space of polynomials in the variable $t$ over the real field $\mathbf{R}$. Then the differential mapping $D : V \to V$ and the integral mapping $\mathcal{J} : V \to \mathbf{R}$ defined in Examples 6.4 and 6.5 are linear. For it is proven in calculus that for any $u, v \in V$ and $k \in \mathbf{R}$,

    $\dfrac{d(u + v)}{dt} = \dfrac{du}{dt} + \dfrac{dv}{dt}$   and   $\dfrac{d(ku)}{dt} = k\dfrac{du}{dt}$

that is, $D(u + v) = D(u) + D(v)$ and $D(ku) = kD(u)$; and also,

    $\int_0^1 (u(t) + v(t))\,dt = \int_0^1 u(t)\,dt + \int_0^1 v(t)\,dt$
and $\int_0^1 k\,u(t)\,dt = k\int_0^1 u(t)\,dt$

that is, $\mathcal{J}(u + v) = \mathcal{J}(u) + \mathcal{J}(v)$ and $\mathcal{J}(ku) = k\mathcal{J}(u)$.


Example 6.16: Let $F : V \to U$ be a linear mapping which is both one-one and onto. Then an inverse mapping $F^{-1} : U \to V$ exists. We will show (Problem 6.17) that this inverse mapping is also linear.


When we investigated the coordinates of a vector relative to a basis, we also introduced the notion of two spaces being isomorphic. We now give a formal definition.


Definition: A linear mapping $F : V \to U$ is called an isomorphism if it is one-to-one. The vector spaces $V$, $U$ are said to be isomorphic if there is an isomorphism of $V$ onto $U$.


Example 6.17: Let $V$ be a vector space over $K$ of dimension $n$ and let $\{e_1, \ldots, e_n\}$ be a basis of $V$. Then as noted previously the mapping $v \mapsto [v]_e$, i.e. which maps each $v \in V$ into its coordinate vector relative to the basis $\{e_i\}$, is an isomorphism of $V$ onto $K^n$.


Our next theorem gives us an abundance of examples of linear mappings; in particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.


Theorem 6.2: Let $V$ and $U$ be vector spaces over a field $K$. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of $V$ and let $u_1, u_2, \ldots, u_n$ be any vectors in $U$. Then there exists a unique linear mapping $F : V \to U$ such that $F(v_1) = u_1$, $F(v_2) = u_2$, ..., $F(v_n) = u_n$.


We emphasize that the vectors $u_1, \ldots, u_n$ in the preceding theorem are completely arbitrary; they may be linearly dependent or they may even be equal to each other.
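The content of Theorem 6.2 can be sketched computationally for $V = \mathbf{Q}^n$: to evaluate $F$ at $v$, first find the coordinates $c_1, \ldots, c_n$ of $v$ relative to the given basis, and then form $c_1u_1 + \cdots + c_nu_n$. The helper names `coordinates` and `linear_map`, the sample basis and the sample targets below are all assumptions made for illustration; the elimination routine assumes its input really is a basis.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve c_1 v_1 + ... + c_n v_n = v by Gauss-Jordan elimination.
    Assumes `basis` is a basis of Q^n, so a pivot always exists."""
    n = len(basis)
    # Augmented matrix: column j is the basis vector v_j, last column is v.
    M = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def linear_map(basis, targets):
    """The linear mapping F with F(basis[i]) = targets[i]."""
    def F(v):
        c = coordinates(basis, v)
        return [sum(c[i] * targets[i][k] for i in range(len(basis)))
                for k in range(len(targets[0]))]
    return F

basis = [(1, 1), (1, -1)]          # a sample basis of Q^2
targets = [(1, 0, 2), (1, 0, 2)]   # arbitrary vectors of Q^3, deliberately equal
F = linear_map(basis, targets)

assert F((1, 1)) == [1, 0, 2] and F((1, -1)) == [1, 0, 2]
assert F((2, 0)) == [2, 0, 4]      # (2, 0) = (1, 1) + (1, -1)
```

The targets were chosen equal on purpose, in line with the remark that the $u_i$ may be dependent or even coincide.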


KERNEL AND IMAGE OF A LINEAR MAPPING 
We begin by defining two concepts. 


Definition: Let $F : V \to U$ be a linear mapping. The image of $F$, written $\operatorname{Im} F$, is the set of image points in $U$:

    $\operatorname{Im} F = \{u \in U : F(v) = u \text{ for some } v \in V\}$

The kernel of $F$, written $\operatorname{Ker} F$, is the set of elements in $V$ which map into $0 \in U$:

    $\operatorname{Ker} F = \{v \in V : F(v) = 0\}$


The following theorem is easily proven (Problem 6.22).


Theorem 6.3: Let $F : V \to U$ be a linear mapping. Then the image of $F$ is a subspace of $U$ and the kernel of $F$ is a subspace of $V$.


Example 6.18: Let $F : \mathbf{R}^3 \to \mathbf{R}^3$ be the projection mapping into the $xy$ plane: $F(x, y, z) = (x, y, 0)$. Clearly the image of $F$ is the entire $xy$ plane:

    $\operatorname{Im} F = \{(a, b, 0) : a, b \in \mathbf{R}\}$

Note that the kernel of $F$ is the $z$ axis:

    $\operatorname{Ker} F = \{(0, 0, c) : c \in \mathbf{R}\}$

since these points and only these points map into the zero vector $0 = (0, 0, 0)$.

    [Figure: a vector $v = (a, b, c)$ and its projection $F(v) = (a, b, 0)$ in the $xy$ plane]


Now suppose that the vectors $v_1, \ldots, v_n$ generate $V$ and that $F : V \to U$ is linear. We show that the vectors $F(v_1), \ldots, F(v_n) \in U$ generate $\operatorname{Im} F$. For suppose $u \in \operatorname{Im} F$; then $F(v) = u$ for some vector $v \in V$. Since the $v_i$ generate $V$ and since $v \in V$, there exist scalars $a_1, \ldots, a_n$ for which $v = a_1v_1 + a_2v_2 + \cdots + a_nv_n$. Accordingly,

    $u = F(v) = F(a_1v_1 + a_2v_2 + \cdots + a_nv_n) = a_1F(v_1) + a_2F(v_2) + \cdots + a_nF(v_n)$
and hence the vectors $F(v_1), \ldots, F(v_n)$ generate $\operatorname{Im} F$.


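Example 6.19: Consider an arbitrary $4 \times 3$ matrix $A$ over a field $K$:

    $A = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \\ d_1 & d_2 & d_3 \end{pmatrix}$

which we view as a linear mapping $A : K^3 \to K^4$. Now the usual basis $\{e_1, e_2, e_3\}$ of $K^3$ generates $K^3$ and so their values $Ae_1, Ae_2, Ae_3$ under $A$ generate the image of $A$. But the vectors $Ae_1$, $Ae_2$ and $Ae_3$ are the columns of $A$:

    $Ae_1 = A\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \end{pmatrix}, \quad Ae_2 = A\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a_2 \\ b_2 \\ c_2 \\ d_2 \end{pmatrix}, \quad Ae_3 = A\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} a_3 \\ b_3 \\ c_3 \\ d_3 \end{pmatrix}$

thus the image of $A$ is precisely the column space of $A$.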


We emphasize that if $A$ is any $m \times n$ matrix over $K$ viewed as a linear mapping $A : K^n \to K^m$, then the image of $A$ is precisely the column space of $A$.
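The observation behind Example 6.19, that $Ae_j$ is the $j$-th column of $A$, is easy to confirm on a numerical matrix. The entries of `A` below are arbitrary sample values and `mat_vec` is a helper name introduced for this sketch.

```python
def mat_vec(A, v):
    """The product Av, with v treated as a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]]     # a sample 4 x 3 matrix, viewed as a map K^3 -> K^4
n = 3

for j in range(n):
    e_j = [1 if i == j else 0 for i in range(n)]   # j-th usual basis vector
    column_j = [row[j] for row in A]               # j-th column of A
    assert mat_vec(A, e_j) == column_j
```

Since the $e_j$ generate $K^n$, their images, the columns, generate the image of $A$.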


So far we have not related the notion of dimension to that of a linear mapping $F : V \to U$. In the case that $V$ is of finite dimension, we have the following fundamental relationship.


Theorem 6.4: Let $V$ be of finite dimension and let $F : V \to U$ be a linear mapping. Then
    $\dim V = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F)$


That is, the sum of the dimensions of the image and kernel of a linear mapping is equal to the dimension of its domain. This formula is easily seen to hold for the projection mapping $F$ in Example 6.18. There the image ($xy$ plane) and the kernel ($z$ axis) of $F$ have dimensions 2 and 1 respectively, whereas the domain $\mathbf{R}^3$ of $F$ has dimension 3.
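Over a finite field, Theorem 6.4 can be verified by brute force, since a subspace of dimension $d$ over a field with $q$ elements has exactly $q^d$ elements. The sketch below works over $\mathbf{Z}_3$; the field, the matrix `A` and the helper `dim` are illustrative assumptions and not part of the text.

```python
from itertools import product

q = 3                        # work over the field Z_3 (an illustrative choice)
A = [[1, 2, 0, 1],
     [2, 1, 0, 2],
     [0, 0, 1, 1]]           # a sample 3 x 4 matrix: F maps K^4 into K^3

def F(v):
    """The linear mapping v -> Av, computed modulo q."""
    return tuple(sum(a * x for a, x in zip(row, v)) % q for row in A)

V = list(product(range(q), repeat=4))          # all 81 vectors of K^4
kernel = [v for v in V if F(v) == (0, 0, 0)]   # Ker F, found by enumeration
image = {F(v) for v in V}                      # Im F, found by enumeration

def dim(size):
    """The dimension d of a subspace having q**d elements."""
    d = 0
    while q ** d < size:
        d += 1
    return d

# dim V = dim(Ker F) + dim(Im F)
assert dim(len(V)) == dim(len(kernel)) + dim(len(image))
```

Here the kernel and image are computed independently of one another, so the final assertion is a genuine check of the formula rather than a restatement of it.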


Remark: Let $F : V \to U$ be a linear mapping. Then the rank of $F$ is defined to be the dimension of its image, and the nullity of $F$ is defined to be the dimension of its kernel:
    $\operatorname{rank}(F) = \dim(\operatorname{Im} F)$   and   $\operatorname{nullity}(F) = \dim(\operatorname{Ker} F)$

Thus the preceding theorem yields the following formula for $F$ when $V$ has finite dimension:
    $\operatorname{rank}(F) + \operatorname{nullity}(F) = \dim V$
Recall that the rank of a matrix $A$ was originally defined to be the dimension of its column space and of its row space. Observe that if we now view $A$ as a linear mapping, then both definitions correspond since the image of $A$ is precisely its column space.




SINGULAR AND NONSINGULAR MAPPINGS 


A linear mapping $F : V \to U$ is said to be singular if the image of some nonzero vector under $F$ is 0, i.e. if there exists $v \in V$ for which $v \ne 0$ but $F(v) = 0$. Thus $F : V \to U$ is nonsingular if only $0 \in V$ maps into $0 \in U$ or, equivalently, if its kernel consists only of the zero vector: $\operatorname{Ker} F = \{0\}$.


Example 6.20: Let $F : \mathbf{R}^3 \to \mathbf{R}^3$ be the linear mapping which rotates a vector about the $z$ axis through an angle $\theta$:

    $F(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)$

Observe that only the zero vector is mapped into the zero vector; hence $F$ is nonsingular.

Now if the linear mapping $F : V \to U$ is one-to-one, then only $0 \in V$ can map into $0 \in U$ and so $F$ is nonsingular. The converse is also true. For suppose $F$ is nonsingular and $F(v) = F(w)$; then $F(v - w) = F(v) - F(w) = 0$ and hence $v - w = 0$ or $v = w$. Thus $F(v) = F(w)$ implies $v = w$, that is, $F$ is one-to-one. By definition (page 125), a one-to-one linear mapping is called an isomorphism. Thus we have proven

Theorem 6.5: A linear mapping $F : V \to U$ is an isomorphism if and only if it is nonsingular.

We remark that nonsingular mappings can also be characterized as those mappings which carry independent sets into independent sets (Problem 6.26).


LINEAR MAPPINGS AND SYSTEMS OF LINEAR EQUATIONS 


Consider a system of $m$ linear equations in $n$ unknowns over a field $K$:

    $a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1$
    $a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2$
    $\cdots$
    $a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m$

which is equivalent to the matrix equation
    $Ax = b$
where $A = (a_{ij})$ is the coefficient matrix, and $x = (x_i)$ and $b = (b_i)$ are the column vectors of the unknowns and of the constants, respectively. Now the matrix $A$ may also be viewed as the linear mapping
    $A : K^n \to K^m$

Thus the solution of the equation $Ax = b$ may be viewed as the preimage of $b \in K^m$ under the linear mapping $A : K^n \to K^m$. Furthermore, the solution of the associated homogeneous equation $Ax = 0$ may be viewed as the kernel of the linear mapping $A : K^n \to K^m$. By Theorem 6.4,
    $\dim(\operatorname{Ker} A) = \dim K^n - \dim(\operatorname{Im} A) = n - \operatorname{rank} A$
But $n$ is exactly the number of unknowns in the homogeneous system $Ax = 0$. Thus we have the following theorem on linear equations appearing in Chapter 5.


Theorem 5.11: The dimension of the solution space $W$ of the homogeneous system of linear equations $AX = 0$ is $n - r$ where $n$ is the number of unknowns and $r$ is the rank of the coefficient matrix $A$.
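Theorem 5.11 can be illustrated by row reducing a coefficient matrix and reading off one solution for each free unknown. The following is a sketch under stated assumptions: the routines `rref` and `null_space_basis` are written for this illustration (they are not taken from the text), the arithmetic is exact over $\mathbf{Q}$, and the matrix `A` is sample data whose third row is the sum of the first two.

```python
from fractions import Fraction

def rref(A):
    """Reduced row echelon form of A and the list of pivot columns."""
    M = [[Fraction(x) for x in row] for row in A]
    pivots, r = [], 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column: free unknown
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def null_space_basis(A):
    """A basis of the solution space of Ax = 0, and the rank of A."""
    M, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)             # set one free unknown to 1
        for row, pcol in zip(M, pivots):  # solve for the pivot unknowns
            v[pcol] = -row[free]
        basis.append(v)
    return basis, len(pivots)

A = [[1, 2, -1, 3],
     [2, 4, 1, 0],
     [3, 6, 0, 3]]                        # sample system: n = 4 unknowns
basis, r = null_space_basis(A)

assert len(basis) == len(A[0]) - r        # dim W = n - r
assert all(sum(a * x for a, x in zip(row, v)) == 0 for row in A for v in basis)
```

For this matrix the rank is 2, so the solution space has dimension $4 - 2 = 2$.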


OPERATIONS WITH LINEAR MAPPINGS 


We are able to combine linear mappings in various ways to obtain new linear mappings. 
These operations are very important and shall be used throughout the text. 


Suppose F:V->U and G:V->U are linear mappings of vector spaces over a field K.
We define the sum F+G to be the mapping from V into U which assigns F(v) + G(v) to
v ∈ V:

(F+G)(v) = F(v) + G(v)


Furthermore, for any scalar k ∈ K, we define the product kF to be the mapping from V
into U which assigns kF(v) to v ∈ V:
(kF)(v) = kF(v)
We show that if F and G are linear, then F+G and kF are also linear. We have, for any
vectors v,w ∈ V and any scalars a,b ∈ K,
(F+G)(av+bw) = F(av+bw) + G(av+bw)
             = aF(v) + bF(w) + aG(v) + bG(w)
             = a(F(v)+G(v)) + b(F(w)+G(w))
             = a(F+G)(v) + b(F+G)(w)
and (kF)(av+bw) = kF(av+bw) = k(aF(v) + bF(w))
                = akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)
Thus F+G and kF are linear.
The following theorem applies. 

Theorem 6.6: Let V and U be vector spaces over a field K. Then the collection of all
linear mappings from V into U with the above operations of addition and
scalar multiplication forms a vector space over K.


The space in the above theorem is usually denoted by 
Hom (V, U) 
Here Hom comes from the word homomorphism. In the case that V and U are of finite 
dimension, we have the following theorem. 
Theorem 6.7: Suppose dimV=m and dimU=n. Then dim Hom(V, U) = mn. 
Now suppose that V, U and W are vector spaces over the same field K, and that F:V->U
and G:U->W are linear mappings:

V --F--> U --G--> W


Recall that the composition function G∘F is the mapping from V into W defined by
(G∘F)(v) = G(F(v)). We show that G∘F is linear whenever F and G are linear. We have,
for any vectors v,w ∈ V and any scalars a,b ∈ K,

(G∘F)(av+bw) = G(F(av+bw)) = G(aF(v) + bF(w))
             = aG(F(v)) + bG(F(w)) = a(G∘F)(v) + b(G∘F)(w)

That is, G∘F is linear.




The composition of linear mappings and that of addition and scalar multiplication are
related as follows:


Theorem 6.8: Let V, U and W be vector spaces over K. Let F,F' be linear mappings from
V into U and G,G' linear mappings from U into W, and let k ∈ K. Then:

(i) G∘(F+F') = G∘F + G∘F'
(ii) (G+G')∘F = G∘F + G'∘F
(iii) k(G∘F) = (kG)∘F = G∘(kF).


ALGEBRA OF LINEAR OPERATORS 


Let V be a vector space over a field K. We now consider the special case of linear
mappings T:V->V, i.e. from V into itself. They are also called linear operators or linear
transformations on V. We will write A(V), instead of Hom(V, V), for the space of all such
mappings.


By Theorem 6.6, A(V) is a vector space over K; it is of dimension n^2 if V is of dimension
n. Now if T,S ∈ A(V), then the composition S∘T exists and is also a linear mapping
from V into itself, i.e. S∘T ∈ A(V). Thus we have a "multiplication" defined in A(V).
(We shall write ST for S∘T in the space A(V).)


We remark that an algebra A over a field K is a vector space over K in which an
operation of multiplication is defined satisfying, for every F,G,H ∈ A and every k ∈ K,

(i) F(G+H) = FG + FH
(ii) (G+H)F = GF + HF
(iii) k(GF) = (kG)F = G(kF).

If the associative law also holds for the multiplication, i.e. if for every F,G,H ∈ A,

(iv) (FG)H = F(GH)
then the algebra A is said to be associative. Thus by Theorems 6.8 and 6.1, A(V) is an
associative algebra over K with respect to composition of mappings; hence it is frequently
called the algebra of linear operators on V.


Observe that the identity mapping I:V->V belongs to A(V). Also, for any T ∈ A(V),
we have TI = IT = T. We note that we can also form "powers" of T; we use the notation
T^2 = T∘T, T^3 = T∘T∘T, .... Furthermore, for any polynomial

p(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n,  a_i ∈ K

we can form the operator p(T) defined by
p(T) = a_0 I + a_1 T + a_2 T^2 + ... + a_n T^n

(For a scalar k ∈ K, the operator kI is frequently denoted by simply k.) In particular, if
p(T) = 0, the zero mapping, then T is said to be a zero of the polynomial p(x).


Example 6.21: Let T:R^3->R^3 be defined by T(x, y, z) = (0, x, y). Now if (a, b, c) is any element
of R^3, then:
(T+I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a+b, b+c)

and T^3(a, b, c) = T^2(0, a, b) = T(0, 0, a) = (0, 0, 0)

Thus we see that T^3 = 0, the zero mapping from V into itself. In other words,
T is a zero of the polynomial p(x) = x^3.
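
The two computations of Example 6.21 can be replayed numerically. This is a minimal sketch; the function names and the sample vector v are assumptions made for illustration.

```python
def T(v):
    """The operator of Example 6.21: T(x, y, z) = (0, x, y)."""
    x, y, z = v
    return (0, x, y)

def T_plus_I(v):
    """(T + I)(v) = T(v) + v, added componentwise."""
    return tuple(t + c for t, c in zip(T(v), v))

def T_cubed(v):
    """T^3 = T o T o T."""
    return T(T(T(v)))

v = (5, -2, 7)  # arbitrary sample vector (a, b, c), chosen only for illustration
```

For this v, T_plus_I(v) agrees with (a, a+b, b+c) and T_cubed(v) is the zero vector, as the example predicts for every vector.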




INVERTIBLE OPERATORS 


A linear operator T:V->V is said to be invertible if it has an inverse, i.e. if there
exists T^-1 ∈ A(V) such that T T^-1 = T^-1 T = I.

Now T is invertible if and only if it is one-one and onto. Thus in particular, if T is
invertible then only 0 ∈ V can map into itself, i.e. T is nonsingular. On the other hand,
suppose T is nonsingular, i.e. Ker T = {0}. Recall (page 127) that T is also one-one.
Moreover, assuming V has finite dimension, we have, by Theorem 6.4,

dim V = dim(Im T) + dim(Ker T) = dim(Im T) + dim({0})
      = dim(Im T) + 0 = dim(Im T)

Then Im T = V, i.e. the image of T is V; thus T is onto. Hence T is both one-one and onto
and so is invertible. We have just proven


Theorem 6.9: A linear operator T:V->V on a vector space of finite dimension is
invertible if and only if it is nonsingular.


Example 6.22: Let T be the operator on R^2 defined by T(x, y) = (y, 2x-y). The kernel of T is
{(0, 0)}; hence T is nonsingular and, by the preceding theorem, invertible. We now
find a formula for T^-1. Suppose (s, t) is the image of (x, y) under T; hence (x, y)
is the image of (s, t) under T^-1: T(x, y) = (s, t) and T^-1(s, t) = (x, y). We have

T(x, y) = (y, 2x-y) = (s, t) and so y = s, 2x - y = t

Solving for x and y in terms of s and t, we obtain x = (1/2)s + (1/2)t, y = s. Thus T^-1
is given by the formula T^-1(s, t) = ((1/2)s + (1/2)t, s).
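
A quick way to gain confidence in the formula for T^-1 is to compose it with T in both orders on a few integer points. The sketch below assumes integer inputs and uses exact fractions; the names T and T_inv are illustrative.

```python
from fractions import Fraction

def T(x, y):
    """The operator of Example 6.22: T(x, y) = (y, 2x - y)."""
    return (y, 2 * x - y)

def T_inv(s, t):
    """The claimed inverse: T^-1(s, t) = ((1/2)s + (1/2)t, s), for integer s, t."""
    return (Fraction(s + t, 2), s)

# Sample integer points, chosen only for illustration.
points = [(0, 0), (1, 0), (0, 1), (3, -4), (-2, 7)]
left_ok = all(T_inv(*T(x, y)) == (x, y) for x, y in points)   # T^-1 T = I
right_ok = all(T(*T_inv(s, t)) == (s, t) for s, t in points)  # T T^-1 = I
```

Both flags come out true on these points, which is consistent with T T^-1 = T^-1 T = I; the algebraic derivation in the example is what proves it for all vectors.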


The finiteness of the dimensionality of V in the preceding theorem is necessary as seen 
in the next example. 


Example 6.23: Let V be the vector space of polynomials over K, and let T be the operator on V
defined by
T(a_0 + a_1 t + ... + a_n t^n) = a_0 t + a_1 t^2 + ... + a_n t^(n+1)

i.e. T increases the exponent of t in each term by 1. Now T is a linear mapping
and is nonsingular. However, T is not onto and so is not invertible.


We now give an important application of the above theorem to systems of linear 
equations over K. Consider a system with the same number of equations as unknowns, 
say n. We can represent this system by the matrix equation 

Ax = b (*)
where A is an n-square matrix over K which we view as a linear operator on K^n. Suppose
the matrix A is nonsingular, i.e. the matrix equation Ax = 0 has only the zero solution.
Then, by Theorem 6.9, the linear mapping A is one-to-one and onto. This means that the
system (*) has a unique solution for any b ∈ K^n. On the other hand, suppose the matrix
A is singular, i.e. the matrix equation Ax = 0 has a nonzero solution. Then the linear
mapping A is not onto. This means that there exist b ∈ K^n for which (*) does not have a
solution. Furthermore, if a solution exists it is not unique. Thus we have proven the
following fundamental result:


Theorem 6.10: Consider the following system of linear equations:

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
..........................................
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n




(i) If the corresponding homogeneous system has only the zero solution,
then the above system has a unique solution for any values of the b_i.

(ii) If the corresponding homogeneous system has a nonzero solution, then:
(a) there are values for the b_i for which the above system does not have
a solution; (b) whenever a solution of the above system exists, it is
not unique.
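
Case (ii) of Theorem 6.10 can be seen on a 2 by 2 system. The matrix A, the vectors z, b, x0 and the helper apply below are hypothetical choices made for illustration; the code only checks specific vectors and is not a general solver.

```python
def apply(A, x):
    """Compute the matrix-vector product Ax, with A given as a list of rows."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

A = [[1, 2],
     [2, 4]]       # singular: the second row is twice the first
z = (2, -1)        # a nonzero solution of the homogeneous system Ax = 0
b = (1, 2)         # a right-hand side for which Ax = b is solvable
x0 = (1, 0)        # one solution of Ax = b
x1 = tuple(p + q for p, q in zip(x0, z))  # x0 + z is a second, different solution

# Every image Ax has the form (t, 2t), so a right-hand side such as (1, 0)
# can never be reached: for that choice the system has no solution.
```

Adding any solution of the homogeneous system to x0 gives another solution of Ax = b, which is exactly why uniqueness fails when A is singular.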


Solved Problems 


MAPPINGS 
6.1. State whether or not each diagram defines a mapping from A = {a, b, c} into
B = {x, y, z}.

[Three arrow diagrams from A to B, labeled (i), (ii) and (iii)]

(i) No. There is nothing assigned to the element b ∈ A.
(ii) No. Two elements, x and z, are assigned to c ∈ A.
(iii) Yes.


6.2. Use a formula to define each of the following functions from R into R. 


(i) To each number let f assign its cube. 

(ii) To each number let g assign the number 5. 

(iii) To each positive number let h assign its square, and to each nonpositive number 
let h assign the number 6. 


Also, find the value of each function at 4, —2 and 0. 


(i) Since f assigns to any number x its cube x^3, we can define f by f(x) = x^3. Also:
f(4) = 4^3 = 64,  f(-2) = (-2)^3 = -8,  f(0) = 0^3 = 0

(ii) Since g assigns 5 to any number x, we can define g by g(x) = 5. Thus the value of g at each
number 4, -2 and 0 is 5:
g(4) = 5,  g(-2) = 5,  g(0) = 5

(iii) Two different rules are used to define h as follows:

h(x) = x^2  if x > 0
h(x) = 6    if x ≤ 0

Since 4 > 0, h(4) = 4^2 = 16. On the other hand, -2, 0 ≤ 0 and so h(-2) = 6, h(0) = 6.
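
The three functions of Problem 6.2 translate directly into code, with the two-rule definition of h becoming a conditional. This is a small illustrative sketch; the dictionary values is an assumed helper for tabulating the results.

```python
def f(x):
    """Assigns to each number its cube."""
    return x ** 3

def g(x):
    """Assigns the number 5 to every number."""
    return 5

def h(x):
    """Square for positive numbers, 6 for nonpositive numbers."""
    return x ** 2 if x > 0 else 6

# Values of each function at 4, -2 and 0, in that order.
values = {name: [fn(x) for x in (4, -2, 0)]
          for name, fn in (("f", f), ("g", g), ("h", h))}
```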




6.3. Let A = {1, 2, 3, 4, 5} and let f:A->A be the mapping defined by the diagram.
(i) Find the image of f. (ii) Find the graph of f.

[Diagram: arrows 1->3, 2->5, 3->5, 4->2, 5->3]

(i) The image f(A) of the mapping f consists of all the points
assigned to elements of A. Now only 2, 3 and 5 appear as
the image of any elements of A; hence f(A) = {2, 3, 5}.

(ii) The graph of f consists of the ordered pairs (a, f(a)),
where a ∈ A. Now f(1) = 3, f(2) = 5, f(3) = 5, f(4) = 2,
f(5) = 3; hence the graph of

f = {(1,3), (2,5), (3,5), (4,2), (5,3)}


6.4. Sketch the graph of: (i) f(x) = x^2 + x - 6, (ii) g(x) = x^3 - 3x^2 - x + 3.


Note that these are “polynomial functions”. In each case set up a table of values for x and 
then find the corresponding values of f(x). Plot the points in a coordinate diagram and then draw 
a smooth continuous curve through the points. 


6.5. Let the mappings f:A->B and g:B->C be defined by the diagram

[Diagram: f sends a->y, b->x, c->y; g sends x->s, y->t, z->r]

(i) Find the composition mapping (g∘f):A->C. (ii) Find the image of each
mapping: f, g and g∘f.
(i) We use the definition of the composition mapping to compute:
(g∘f)(a) = g(f(a)) = g(y) = t
(g∘f)(b) = g(f(b)) = g(x) = s
(g∘f)(c) = g(f(c)) = g(y) = t

Observe that we arrive at the same answer if we "follow the arrows" in the diagram:

a->y->t,  b->x->s,  c->y->t


(ii) By the diagram, the image values under the mapping f are x and y, and the image values under
g are r, s and t; hence

image of f = {x, y} and image of g = {r, s, t}

By (i), the image values under the composition mapping g∘f are t and s; hence image of
g∘f = {s, t}. Note that the images of g and g∘f are different.


6.6. Let the mappings f and g be defined by f(x) = 2x+1 and g(x) = x^2 - 2. (i) Find
(g∘f)(4) and (f∘g)(4). (ii) Find formulas defining the composition mappings g∘f
and f∘g.

(i) f(4) = 2·4 + 1 = 9. Hence (g∘f)(4) = g(f(4)) = g(9) = 9^2 - 2 = 79.
g(4) = 4^2 - 2 = 14. Hence (f∘g)(4) = f(g(4)) = f(14) = 2·14 + 1 = 29.

(ii) Compute the formula for g∘f as follows:
(g∘f)(x) = g(f(x)) = g(2x+1) = (2x+1)^2 - 2 = 4x^2 + 4x - 1
Observe that the same answer can be found by writing y = f(x) = 2x+1 and z = g(y) =
y^2 - 2, and then eliminating y: z = y^2 - 2 = (2x+1)^2 - 2 = 4x^2 + 4x - 1.
(f∘g)(x) = f(g(x)) = f(x^2 - 2) = 2(x^2 - 2) + 1 = 2x^2 - 3. Observe that f∘g ≠ g∘f.


6.7. Let the mappings f:A->B, g:B->C and h:C->D be defined by the diagram

[Diagram of the mappings f, g and h]

Determine if each mapping (i) is one-one, (ii) is onto, (iii) has an inverse.

(i) The mapping f:A->B is one-one since each element of A has a different image. The mapping
g:B->C is not one-one since x and z both map into the same element 4. The mapping h:C->D
is one-one.

(ii) The mapping f:A->B is not onto since z ∈ B is not the image of any element of A. The
mapping g:B->C is onto since each element of C is the image of some element of B. The
mapping h:C->D is also onto.

(iii) A mapping has an inverse if and only if it is both one-one and onto. Hence only h has an
inverse.


6.8. Suppose f:A->B and g:B->C; hence the composition mapping (g∘f):A->C exists.
Prove the following. (i) If f and g are one-one, then g∘f is one-one. (ii) If f and g
are onto, then g∘f is onto. (iii) If g∘f is one-one, then f is one-one. (iv) If g∘f is
onto, then g is onto.

(i) Suppose (g∘f)(x) = (g∘f)(y). Then g(f(x)) = g(f(y)). Since g is one-one, f(x) = f(y). Since f
is one-one, x = y. We have proven that (g∘f)(x) = (g∘f)(y) implies x = y; hence g∘f is
one-one.

(ii) Suppose c ∈ C. Since g is onto, there exists b ∈ B for which g(b) = c. Since f is onto, there
exists a ∈ A for which f(a) = b. Thus (g∘f)(a) = g(f(a)) = g(b) = c; hence g∘f is onto.

(iii) Suppose f is not one-one. Then there exist distinct elements x,y ∈ A for which f(x) = f(y).
Thus (g∘f)(x) = g(f(x)) = g(f(y)) = (g∘f)(y); hence g∘f is not one-one. Accordingly if g∘f
is one-one, then f must be one-one.

(iv) If a ∈ A, then (g∘f)(a) = g(f(a)) ∈ g(B); hence (g∘f)(A) ⊂ g(B). Suppose g is not onto.
Then g(B) is properly contained in C and so (g∘f)(A) is properly contained in C; thus g∘f is
not onto. Accordingly if g∘f is onto, then g must be onto.
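
Parts (iii) and (iv) of Problem 6.8 can be illustrated with mappings of finite sets stored as dictionaries. The sets and the mappings f and g below are hypothetical, chosen so that g∘f is one-one and onto while g is not one-one and f is not onto; the helper names are likewise assumptions.

```python
def compose(g, f):
    """Return the composition g o f of two finite mappings given as dicts."""
    return {a: g[f[a]] for a in f}

def is_one_one(f):
    """A finite mapping is one-one when no image value is repeated."""
    return len(set(f.values())) == len(f)

def is_onto(f, codomain):
    """A finite mapping is onto when its image is the whole codomain."""
    return set(f.values()) == set(codomain)

# Hypothetical sets A = {1, 2}, B = {x, y, z}, C = {r, s}.
B = {"x", "y", "z"}
C = {"r", "s"}
f = {1: "x", 2: "y"}               # f: A -> B
g = {"x": "r", "y": "s", "z": "r"}  # g: B -> C
gf = compose(g, f)                  # g o f: A -> C
```

Here g∘f is both one-one and onto, so by (iii) f is one-one and by (iv) g is onto. The same example shows that nothing more can be concluded: g fails to be one-one and f fails to be onto.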


6.9. Prove that a mapping f:A->B has an inverse if and only if it is one-to-one and onto.

Suppose f has an inverse, i.e. there exists a function f^-1:B->A for which f^-1∘f = 1_A and
f∘f^-1 = 1_B. Since 1_A is one-to-one, f is one-to-one by Problem 6.8(iii); and since 1_B is onto, f is
onto by Problem 6.8(iv). That is, f is both one-to-one and onto.

Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element
in A, say b̂. Thus if f(a) = b, then a = b̂; hence f(b̂) = b. Now let g denote the mapping from
B to A defined by b -> b̂. We have:

(i) (g∘f)(a) = g(f(a)) = g(b) = b̂ = a, for every a ∈ A; hence g∘f = 1_A.
(ii) (f∘g)(b) = f(g(b)) = f(b̂) = b, for every b ∈ B; hence f∘g = 1_B.

Accordingly, f has an inverse. Its inverse is the mapping g.


6.10. Let f:R->R be defined by f(x) = 2x - 8. Now f is one-to-one and onto; hence f
has an inverse mapping f^-1. Find a formula for f^-1.

Let y be the image of x under the mapping f: y = f(x) = 2x - 8. Consequently x will be the
image of y under the inverse mapping f^-1. Thus solve for x in terms of y in the above equation:
x = (y+8)/2. Then the formula defining the inverse function is f^-1(y) = (y+8)/2.
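
The inverse formula of Problem 6.10 can be checked by composing the two functions in both orders. The sketch below uses the constant as it appears in this text and restricts itself to integer sample points so that exact fractions can be used; both restrictions are assumptions of the illustration.

```python
from fractions import Fraction

def f(x):
    """f(x) = 2x - 8, as stated in Problem 6.10."""
    return 2 * x - 8

def f_inv(y):
    """The claimed inverse f^-1(y) = (y + 8)/2, evaluated exactly for integer y."""
    return Fraction(y + 8, 2)

samples = range(-5, 6)  # integer sample points, chosen only for illustration
inverse_left = all(f_inv(f(x)) == x for x in samples)   # f^-1 o f = identity
inverse_right = all(f(f_inv(y)) == y for y in samples)  # f o f^-1 = identity
```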


LINEAR MAPPINGS 


6.11. Show that the following mappings F are linear:
(i) F:R^2->R^2 defined by F(x, y) = (x+y, x).
(ii) F:R^3->R defined by F(x, y, z) = 2x - 3y + 4z.

(i) Let v = (a, b) and w = (a', b'); hence
v + w = (a+a', b+b') and kv = (ka, kb), k ∈ R
We have F(v) = (a+b, a) and F(w) = (a'+b', a'). Thus
F(v+w) = F(a+a', b+b') = (a+a'+b+b', a+a')
       = (a+b, a) + (a'+b', a') = F(v) + F(w)
and F(kv) = F(ka, kb) = (ka+kb, ka) = k(a+b, a) = kF(v)
Since v, w and k were arbitrary, F is linear.

(ii) Let v = (a, b, c) and w = (a', b', c'); hence
v + w = (a+a', b+b', c+c') and kv = (ka, kb, kc), k ∈ R
We have F(v) = 2a - 3b + 4c and F(w) = 2a' - 3b' + 4c'. Thus
F(v+w) = F(a+a', b+b', c+c') = 2(a+a') - 3(b+b') + 4(c+c')
       = (2a - 3b + 4c) + (2a' - 3b' + 4c') = F(v) + F(w)
and F(kv) = F(ka, kb, kc) = 2ka - 3kb + 4kc = k(2a - 3b + 4c) = kF(v)

Accordingly, F is linear.


6.12. Show that the following mappings F are not linear:
(i) F:R^2->R defined by F(x, y) = xy.
(ii) F:R^2->R^3 defined by F(x, y) = (x+1, 2y, x+y).
(iii) F:R^3->R^2 defined by F(x, y, z) = (|x|, 0).

(i) Let v = (1, 2) and w = (3, 4); then v + w = (4, 6).
We have F(v) = 1·2 = 2 and F(w) = 3·4 = 12. Hence
F(v+w) = F(4, 6) = 4·6 = 24 ≠ F(v) + F(w)
Accordingly, F is not linear.

(ii) Since F(0, 0) = (1, 0, 0) ≠ (0, 0, 0), F cannot be linear.

(iii) Let v = (1, 2, 3) and k = -3; hence kv = (-3, -6, -9).
We have F(v) = (1, 0) and so kF(v) = -3(1, 0) = (-3, 0). Then
F(kv) = F(-3, -6, -9) = (3, 0) ≠ kF(v)
and hence F is not linear.
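
Each counterexample in Problem 6.12 is a single numerical check, so it can be replayed directly. The function names F1, F2, F3 and the variable names are assumptions for illustration.

```python
def F1(x, y):
    """Mapping (i): F(x, y) = xy."""
    return x * y

def F2(x, y):
    """Mapping (ii): F(x, y) = (x + 1, 2y, x + y)."""
    return (x + 1, 2 * y, x + y)

def F3(x, y, z):
    """Mapping (iii): F(x, y, z) = (|x|, 0)."""
    return (abs(x), 0)

# (i) additivity fails for v = (1, 2), w = (3, 4)
sum_of_images = F1(1, 2) + F1(3, 4)
image_of_sum = F1(1 + 3, 2 + 4)

# (ii) a linear mapping must send the zero vector to the zero vector
image_of_zero = F2(0, 0)

# (iii) homogeneity fails for v = (1, 2, 3), k = -3
k = -3
scaled_image = tuple(k * c for c in F3(1, 2, 3))
image_of_scaled = F3(k * 1, k * 2, k * 3)
```

A single failing instance is enough to show that a mapping is not linear, whereas proving linearity, as in Problem 6.11, requires an argument for arbitrary vectors and scalars.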


6.13. Let V be the vector space of n-square matrices over K. Let M be an arbitrary matrix
in V. Let T:V->V be defined by T(A) = AM + MA, where A ∈ V. Show that
T is linear.

For any A,B ∈ V and any k ∈ K, we have
T(A+B) = (A+B)M + M(A+B) = AM + BM + MA + MB
       = (AM + MA) + (BM + MB) = T(A) + T(B)
and T(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kT(A)

Accordingly, T is linear.


6.14. Prove Theorem 6.2: Let V and U be vector spaces over a field K. Let {v_1,...,v_n} be
a basis of V and let u_1,...,u_n be any arbitrary vectors in U. Then there exists a
unique linear mapping F:V->U such that F(v_1) = u_1, F(v_2) = u_2, ..., F(v_n) = u_n.


There are three steps to the proof of the theorem: (1) Define a mapping F:V->U such that
F(v_i) = u_i, i = 1,...,n. (2) Show that F is linear. (3) Show that F is unique.

Step (1). Let v ∈ V. Since {v_1,...,v_n} is a basis of V, there exist unique scalars a_1,...,a_n ∈ K
for which v = a_1v_1 + a_2v_2 + ... + a_nv_n. We define F:V->U by F(v) = a_1u_1 + a_2u_2 + ... + a_nu_n.
(Since the a_i are unique, the mapping F is well-defined.) Now, for i = 1,...,n,

v_i = 0v_1 + ... + 1v_i + ... + 0v_n

Hence F(v_i) = 0u_1 + ... + 1u_i + ... + 0u_n = u_i
Thus the first step of the proof is complete.

Step (2). Suppose v = a_1v_1 + a_2v_2 + ... + a_nv_n and w = b_1v_1 + b_2v_2 + ... + b_nv_n. Then
v + w = (a_1+b_1)v_1 + (a_2+b_2)v_2 + ... + (a_n+b_n)v_n
and, for any k ∈ K, kv = ka_1v_1 + ka_2v_2 + ... + ka_nv_n. By definition of the mapping F,
F(v) = a_1u_1 + a_2u_2 + ... + a_nu_n and F(w) = b_1u_1 + b_2u_2 + ... + b_nu_n
Hence F(v+w) = (a_1+b_1)u_1 + (a_2+b_2)u_2 + ... + (a_n+b_n)u_n
             = (a_1u_1 + a_2u_2 + ... + a_nu_n) + (b_1u_1 + b_2u_2 + ... + b_nu_n)
             = F(v) + F(w)
and F(kv) = k(a_1u_1 + a_2u_2 + ... + a_nu_n) = kF(v)

Thus F is linear.

Step (3). Now suppose G:V->U is linear and G(v_i) = u_i, i = 1,...,n. If v = a_1v_1 + a_2v_2 +
... + a_nv_n, then
G(v) = G(a_1v_1 + a_2v_2 + ... + a_nv_n) = a_1G(v_1) + a_2G(v_2) + ... + a_nG(v_n)
     = a_1u_1 + a_2u_2 + ... + a_nu_n = F(v)

Since G(v) = F(v) for every v ∈ V, G = F. Thus F is unique and the theorem is proved.


6.15. Let T:R^2->R be the linear mapping for which
T(1, 1) = 3 and T(0, 1) = -2 (1)
(Since {(1,1), (0,1)} is a basis of R^2, such a linear mapping exists and is unique by
Theorem 6.2.) Find T(a, b).

First we write (a, b) as a linear combination of (1,1) and (0,1) using unknown scalars x and y:
(a, b) = x(1, 1) + y(0, 1) (2)
Then (a, b) = (x, x) + (0, y) = (x, x+y) and so x = a, x + y = b
Solving for x and y in terms of a and b, we obtain
x = a and y = b - a (3)
Now using (1) and (2) we have
T(a, b) = T(x(1, 1) + y(0, 1)) = xT(1, 1) + yT(0, 1) = 3x - 2y
Finally, using (3) we have T(a, b) = 3x - 2y = 3(a) - 2(b-a) = 5a - 2b.


6.16. Let T:V->U be linear, and suppose v_1,...,v_n ∈ V have the property that their
images T(v_1),...,T(v_n) are linearly independent. Show that the vectors v_1,...,v_n
are also linearly independent.

Suppose that, for scalars a_1,...,a_n, a_1v_1 + a_2v_2 + ... + a_nv_n = 0. Then
0 = T(0) = T(a_1v_1 + a_2v_2 + ... + a_nv_n) = a_1T(v_1) + a_2T(v_2) + ... + a_nT(v_n)

Since the T(v_i) are linearly independent, all the a_i = 0. Thus the vectors v_1,...,v_n are linearly
independent.


6.17. Suppose the linear mapping F:V->U is one-to-one and onto. Show that the inverse
mapping F^-1:U->V is also linear.

Suppose u,u' ∈ U. Since F is one-to-one and onto, there exist unique vectors v,v' ∈ V for
which F(v) = u and F(v') = u'. Since F is linear, we also have

F(v+v') = F(v) + F(v') = u + u' and F(kv) = kF(v) = ku
By definition of the inverse mapping, F^-1(u) = v, F^-1(u') = v', F^-1(u+u') = v+v' and
F^-1(ku) = kv. Then
F^-1(u+u') = v + v' = F^-1(u) + F^-1(u') and F^-1(ku) = kv = kF^-1(u)

and thus F^-1 is linear.
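
The method of Problem 6.15 (express the vector in the given basis, then use linearity) can be compared against the closed formula T(a, b) = 5a - 2b. The sketch below is illustrative; the function names and the grid of sample points are assumptions.

```python
def T(a, b):
    """Closed formula obtained in Problem 6.15."""
    return 5 * a - 2 * b

def T_via_basis(a, b):
    """Recompute T(a, b) from the prescribed values T(1,1) = 3 and T(0,1) = -2."""
    x, y = a, b - a          # (a, b) = x(1, 1) + y(0, 1)
    return 3 * x - 2 * y     # xT(1, 1) + yT(0, 1)

grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
formulas_agree = all(T(a, b) == T_via_basis(a, b) for a, b in grid)
```

The closed formula reproduces the two prescribed values and agrees with the basis computation on every grid point, as Theorem 6.2 leads one to expect.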


IMAGE AND KERNEL OF LINEAR MAPPINGS 


6.18. Let F:R^4->R^3 be the linear mapping defined by
F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t)
Find a basis and the dimension of the (i) image U of F, (ii) kernel W of F.

(i) The images of the following generators of R^4 generate the image U of F:
F(1, 0, 0, 0) = (1, 1, 1)      F(0, 0, 1, 0) = (1, 2, 3)
F(0, 1, 0, 0) = (-1, 0, 1)     F(0, 0, 0, 1) = (1, -1, -3)

Form the matrix whose rows are the generators of U and row reduce to echelon form:

[  1   1   1 ]      [ 1   1   1 ]      [ 1  1  1 ]
[ -1   0   1 ]  to  [ 0   1   2 ]  to  [ 0  1  2 ]
[  1   2   3 ]      [ 0   1   2 ]      [ 0  0  0 ]
[  1  -1  -3 ]      [ 0  -2  -4 ]      [ 0  0  0 ]

Thus {(1,1,1), (0,1,2)} is a basis of U; hence dim U = 2.




(ii) We seek the set of (x, y, s, t) such that F(x, y, s, t) = (0, 0, 0), i.e.,
F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t) = (0, 0, 0)

Set corresponding components equal to each other to form the following homogeneous system
whose solution space is the kernel W of F:

x - y +  s +  t = 0        x - y +  s +  t = 0        x - y + s +  t = 0
x     + 2s -  t = 0   or       y +  s - 2t = 0   or       y + s - 2t = 0
x + y + 3s - 3t = 0           2y + 2s - 4t = 0

The free variables are s and t; hence dim W = 2. Set
(a) s = -1, t = 0 to obtain the solution (2, 1, -1, 0),
(b) s = 0, t = 1 to obtain the solution (1, 2, 0, 1).

Thus {(2, 1, -1, 0), (1, 2, 0, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 2 = 4
which is the dimension of the domain R^4 of F.)
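
The answers to Problem 6.18 can be verified by direct substitution: the kernel basis vectors must map to zero, and each image of a unit vector must be a combination of the two basis vectors found for U. The following sketch uses illustrative names (F, kernel_basis, combo) that are not part of the text.

```python
def F(x, y, s, t):
    """The mapping of Problem 6.18."""
    return (x - y + s + t, x + 2 * s - t, x + y + 3 * s - 3 * t)

kernel_basis = [(2, 1, -1, 0), (1, 2, 0, 1)]
image_basis = [(1, 1, 1), (0, 1, 2)]

def combo(c1, c2):
    """The linear combination c1*(1, 1, 1) + c2*(0, 1, 2) of the image basis."""
    return tuple(c1 * p + c2 * q for p, q in zip(*image_basis))

# Images of the usual unit vectors of R^4.
images = [F(1, 0, 0, 0), F(0, 1, 0, 0), F(0, 0, 1, 0), F(0, 0, 0, 1)]
```

The four images equal combo(1, 0), combo(-1, 1), combo(1, 1) and combo(1, -2) respectively, so two vectors do generate the image, in agreement with dim U = 2.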




6.19. Let T:R^3->R^3 be the linear mapping defined by
T(x, y, z) = (x + 2y - z, y + z, x + y - 2z)
Find a basis and the dimension of the (i) image U of T, (ii) kernel W of T.

(i) The images of generators of R^3 generate the image U of T:
T(1, 0, 0) = (1, 0, 1),  T(0, 1, 0) = (2, 1, 1),  T(0, 0, 1) = (-1, 1, -2)

Form the matrix whose rows are the generators of U and row reduce to echelon form:

[  1  0   1 ]      [ 1  0   1 ]      [ 1  0   1 ]
[  2  1   1 ]  to  [ 0  1  -1 ]  to  [ 0  1  -1 ]
[ -1  1  -2 ]      [ 0  1  -1 ]      [ 0  0   0 ]

Thus {(1, 0, 1), (0, 1, -1)} is a basis of U, and so dim U = 2.

(ii) We seek the set of (x, y, z) such that T(x, y, z) = (0, 0, 0), i.e.,
T(x, y, z) = (x + 2y - z, y + z, x + y - 2z) = (0, 0, 0)

Set corresponding components equal to each other to form the homogeneous system whose
solution space is the kernel W of T:

x + 2y -  z = 0        x + 2y - z = 0        x + 2y - z = 0
     y +  z = 0   or        y + z = 0   or        y + z = 0
x +  y - 2z = 0            -y - z = 0

The only free variable is z; hence dim W = 1. Let z = 1; then y = -1 and x = 3. Thus
{(3, -1, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 1 = 3, which is the
dimension of the domain R^3 of T.)


6.20. Find a linear map F:R^3->R^4 whose image is generated by (1, 2, 0, -4) and (2, 0, -1, -3).

Method 1.

Consider the usual basis of R^3: e_1 = (1, 0, 0), e_2 = (0, 1, 0), e_3 = (0, 0, 1). Set F(e_1) = (1, 2, 0, -4),
F(e_2) = (2, 0, -1, -3), and F(e_3) = (0, 0, 0, 0). By Theorem 6.2, such a linear map F exists and is
unique. Furthermore, the image of F is generated by the F(e_i); hence F has the required property.
We find a general formula for F(x, y, z):

F(x, y, z) = F(xe_1 + ye_2 + ze_3) = xF(e_1) + yF(e_2) + zF(e_3)
           = x(1, 2, 0, -4) + y(2, 0, -1, -3) + z(0, 0, 0, 0)
           = (x + 2y, 2x, -y, -4x - 3y)


Method 2.
Form a 4 x 3 matrix A whose columns consist only of the given vectors; say,

A = [  1   2   2 ]
    [  2   0   0 ]
    [  0  -1  -1 ]
    [ -4  -3  -3 ]

Recall that A determines a linear map A:R^3->R^4 whose image is generated by the columns of A.
Thus A satisfies the required condition.


6.21. Let V be the vector space of 2 by 2 matrices over R and let

M = [ 1  2 ]
    [ 0  3 ]

Let F:V->V be the linear map defined by F(A) = AM - MA. Find a basis and the
dimension of the kernel W of F.

We seek the set of matrices

[ x  y ]                   [ x  y ]     [ 0  0 ]
[ s  t ]   such that   F   [ s  t ]  =  [ 0  0 ]

We have

F [ x  y ]  =  [ x  y ] [ 1  2 ]  -  [ 1  2 ] [ x  y ]
  [ s  t ]     [ s  t ] [ 0  3 ]     [ 0  3 ] [ s  t ]

            =  [ x  2x+3y ]  -  [ x+2s  y+2t ]
               [ s  2s+3t ]     [ 3s    3t   ]

            =  [ -2s  2x+2y-2t ]  =  [ 0  0 ]
               [ -2s  2s       ]     [ 0  0 ]

Thus 2x + 2y - 2t = 0 and 2s = 0
or
x + y - t = 0 and s = 0

The free variables are y and t; hence dim W = 2. To obtain a basis of W set
(a) y = -1, t = 0 to obtain the solution x = 1, y = -1, s = 0, t = 0;
(b) y = 0, t = 1 to obtain the solution x = 1, y = 0, s = 0, t = 1.

Thus the two matrices

[ 1  -1 ]        [ 1  0 ]
[ 0   0 ]  and   [ 0  1 ]

form a basis of W.


6.22. Prove Theorem 6.3: Let F:V->U be a linear mapping. Then (i) the image of F
is a subspace of U and (ii) the kernel of F is a subspace of V.

(i) Since F(0) = 0, 0 ∈ Im F. Now suppose u,u' ∈ Im F and a,b ∈ K. Since u and u' belong to
the image of F, there exist vectors v,v' ∈ V such that F(v) = u and F(v') = u'. Then

F(av + bv') = aF(v) + bF(v') = au + bu' ∈ Im F
Thus the image of F is a subspace of U.

(ii) Since F(0) = 0, 0 ∈ Ker F. Now suppose v,w ∈ Ker F and a,b ∈ K. Since v and w belong
to the kernel of F, F(v) = 0 and F(w) = 0. Thus

F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 and so av + bw ∈ Ker F
Thus the kernel of F is a subspace of V.


6.23. Prove Theorem 6.4: Let V be of finite dimension, and let F:V->U be a linear
mapping with image U' and kernel W. Then dim U' + dim W = dim V.

Suppose dim V = n. Since W is a subspace of V, its dimension is finite; say, dim W = r ≤ n.
Thus we need prove that dim U' = n - r.


Let {w_1,...,w_r} be a basis of W. We extend {w_i} to a basis of V:
{w_1, ..., w_r, v_1, ..., v_(n-r)}
Let B = {F(v_1), F(v_2), ..., F(v_(n-r))}
The theorem is proved if we show that B is a basis of the image U' of F.

Proof that B generates U'. Let u ∈ U'. Then there exists v ∈ V such that F(v) = u. Since
{w_i, v_j} generates V and since v ∈ V,

v = a_1w_1 + ... + a_rw_r + b_1v_1 + ... + b_(n-r)v_(n-r)
where the a_i, b_j are scalars. Note that F(w_i) = 0 since the w_i belong to the kernel of F. Thus
u = F(v) = F(a_1w_1 + ... + a_rw_r + b_1v_1 + ... + b_(n-r)v_(n-r))
  = a_1F(w_1) + ... + a_rF(w_r) + b_1F(v_1) + ... + b_(n-r)F(v_(n-r))
  = a_1 0 + ... + a_r 0 + b_1F(v_1) + ... + b_(n-r)F(v_(n-r))
  = b_1F(v_1) + ... + b_(n-r)F(v_(n-r))
Accordingly, the F(v_j) generate the image of F.

Proof that B is linearly independent. Suppose
a_1F(v_1) + a_2F(v_2) + ... + a_(n-r)F(v_(n-r)) = 0

Then F(a_1v_1 + a_2v_2 + ... + a_(n-r)v_(n-r)) = 0 and so a_1v_1 + ... + a_(n-r)v_(n-r) belongs to the kernel W
of F. Since {w_i} generates W, there exist scalars b_1,...,b_r such that
a_1v_1 + a_2v_2 + ... + a_(n-r)v_(n-r) = b_1w_1 + b_2w_2 + ... + b_rw_r
or a_1v_1 + ... + a_(n-r)v_(n-r) - b_1w_1 - ... - b_rw_r = 0 (*)

Since {w_i, v_j} is a basis of V, it is linearly independent; hence the coefficients of the w_i and v_j in (*)
are all 0. In particular, a_1 = 0, ..., a_(n-r) = 0. Accordingly, the F(v_j) are linearly independent.

Thus B is a basis of U', and so dim U' = n - r and the theorem is proved.


6.24. Suppose f:V->U is linear with kernel W, and that f(v) = u. Show that the "coset"
v + W = {v + w: w ∈ W} is the preimage of u, that is, f^-1(u) = v + W.

We must prove that (i) f^-1(u) ⊂ v + W and (ii) v + W ⊂ f^-1(u). We first prove (i). Suppose
v' ∈ f^-1(u). Then f(v') = u and so f(v' - v) = f(v') - f(v) = u - u = 0, that is, v' - v ∈ W. Thus
v' = v + (v' - v) ∈ v + W and hence f^-1(u) ⊂ v + W.

Now we prove (ii). Suppose v' ∈ v + W. Then v' = v + w where w ∈ W. Since W is the
kernel of f, f(w) = 0. Accordingly, f(v') = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u. Thus
v' ∈ f^-1(u) and so v + W ⊂ f^-1(u).


SINGULAR AND NONSINGULAR MAPPINGS 


6.25. Suppose F:V->U is linear and that V is of finite dimension. Show that V and the
image of F have the same dimension if and only if F is nonsingular. Determine all
nonsingular mappings T:R^4->R^3.

By Theorem 6.4, dim V = dim(Im F) + dim(Ker F). Hence V and Im F have the same
dimension if and only if dim(Ker F) = 0 or Ker F = {0}, i.e. if and only if F is nonsingular.

Since the dimension of R^3 is less than the dimension of R^4, so is the dimension of the image of
T. Accordingly, no linear mapping T:R^4->R^3 can be nonsingular.


6.26. Prove that a linear mapping F:V->U is nonsingular if and only if the image of
an independent set is independent.

Suppose F is nonsingular and suppose {v_1,...,v_n} is an independent subset of V. We claim that
the vectors F(v_1),...,F(v_n) are independent. Suppose a_1F(v_1) + a_2F(v_2) + ... + a_nF(v_n) = 0, where
a_i ∈ K. Since F is linear, F(a_1v_1 + a_2v_2 + ... + a_nv_n) = 0; hence

a_1v_1 + a_2v_2 + ... + a_nv_n ∈ Ker F

But F is nonsingular, i.e. Ker F = {0}; hence a_1v_1 + a_2v_2 + ... + a_nv_n = 0. Since the v_i are linearly
independent, all the a_i are 0. Accordingly, the F(v_i) are linearly independent. In other words, the
image of the independent set {v_1,...,v_n} is independent.

On the other hand, suppose the image of any independent set is independent. If v ∈ V is
nonzero, then {v} is independent. Then {F(v)} is independent and so F(v) ≠ 0. Accordingly, F is
nonsingular.


OPERATIONS WITH LINEAR MAPPINGS 


6.27. Let F:R^3->R^2 and G:R^3->R^2 be defined by F(x, y, z) = (2x, y+z) and G(x, y, z) =
(x-z, y). Find formulas defining the mappings F+G, 3F and 2F-5G.

(F+G)(x, y, z) = F(x, y, z) + G(x, y, z)
               = (2x, y+z) + (x-z, y) = (3x-z, 2y+z)
(3F)(x, y, z) = 3F(x, y, z) = 3(2x, y+z) = (6x, 3y+3z)
(2F-5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y+z) - 5(x-z, y)
                 = (4x, 2y+2z) + (-5x+5z, -5y) = (-x+5z, -3y+2z)


6.28. Let F:R^3->R^2 and G:R^2->R^2 be defined by F(x, y, z) = (2x, y+z) and G(x, y) =
(y, x). Derive formulas defining the mappings G∘F and F∘G.

(G∘F)(x, y, z) = G(F(x, y, z)) = G(2x, y+z) = (y+z, 2x)

The mapping F∘G is not defined since the image of G is not contained in the domain of F.


6.29. Show: (i) the zero mapping 0, defined by 0(v) = 0 for every v ∈ V, is the zero
element of Hom(V, U); (ii) the negative of F ∈ Hom(V, U) is the mapping (-1)F, i.e.
-F = (-1)F.

(i) Let F ∈ Hom(V, U). Then, for every v ∈ V,
(F+0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)
Since (F+0)(v) = F(v) for every v ∈ V, F + 0 = F.

(ii) For every v ∈ V,
(F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = 0(v)

Since (F + (-1)F)(v) = 0(v) for every v ∈ V, F + (-1)F = 0. Thus (-1)F is the negative
of F.


6.30. Show that for F_1,...,F_n ∈ Hom(V, U) and a_1,...,a_n ∈ K, and for any v ∈ V,
(a_1F_1 + a_2F_2 + ... + a_nF_n)(v) = a_1F_1(v) + a_2F_2(v) + ... + a_nF_n(v)

By definition of the mapping a_1F_1, (a_1F_1)(v) = a_1F_1(v); hence the theorem holds for n = 1.
Thus by induction,

(a_1F_1 + a_2F_2 + ... + a_nF_n)(v) = (a_1F_1)(v) + (a_2F_2 + ... + a_nF_n)(v)
                                    = a_1F_1(v) + a_2F_2(v) + ... + a_nF_n(v)


6.31. Let F:R^3->R^2, G:R^3->R^2 and H:R^3->R^2 be defined by F(x, y, z) = (x+y+z, x+y),
G(x, y, z) = (2x+z, x+y) and H(x, y, z) = (2y, x). Show that F,G,H ∈ Hom(R^3, R^2)
are linearly independent.

Suppose, for scalars a,b,c ∈ K,
aF + bG + cH = 0 (1)
(Here 0 is the zero mapping.) For e_1 = (1, 0, 0) ∈ R^3, we have

(aF + bG + cH)(e_1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0)
                    = a(1, 1) + b(2, 1) + c(0, 1) = (a+2b, a+b+c)

and 0(e_1) = (0, 0). Thus by (1), (a+2b, a+b+c) = (0, 0) and so
a + 2b = 0 and a + b + c = 0 (2)
Similarly for e_2 = (0, 1, 0) ∈ R^3, we have
(aF + bG + cH)(e_2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0)
                    = a(1, 1) + b(0, 1) + c(2, 0) = (a+2c, a+b) = 0(e_2) = (0, 0)
Thus a + 2c = 0 and a + b = 0 (3)
Using (2) and (3) we obtain a = 0, b = 0, c = 0 (4)

Since (1) implies (4), the mappings F, G and H are linearly independent.


6.32. Prove Theorem 6.7: Suppose dim V = m and dim U = n. Then dim Hom(V, U) = mn.

Suppose {v_1,...,v_m} is a basis of V and {u_1,...,u_n} is a basis of U. By Theorem 6.2, a linear
mapping in Hom(V, U) is uniquely determined by arbitrarily assigning elements of U to the basis
elements v_i of V. We define

F_ij ∈ Hom(V, U),  i = 1,...,m,  j = 1,...,n
to be the linear mapping for which F_ij(v_i) = u_j, and F_ij(v_k) = 0 for k ≠ i. That is, F_ij maps v_i
into u_j and the other v's into 0. Observe that {F_ij} contains exactly mn elements; hence the theorem
is proved if we show that it is a basis of Hom(V, U).

Proof that {F_ij} generates Hom(V, U). Let F ∈ Hom(V, U). Suppose F(v_1) = w_1, F(v_2) =
w_2, ..., F(v_m) = w_m. Since w_k ∈ U, it is a linear combination of the u's; say,
w_k = a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n,  k = 1,...,m,  a_ij ∈ K (1)
Consider the linear mapping G = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij F_ij. Since G is a linear combination of the F_ij, the
proof that {F_ij} generates Hom(V, U) is complete if we show that F = G.

We now compute G(v_k), k = 1,...,m. Since F_ij(v_k) = 0 for k ≠ i and F_kj(v_k) = u_j,

G(v_k) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij F_ij(v_k) = Σ_{j=1}^{n} a_kj F_kj(v_k) = Σ_{j=1}^{n} a_kj u_j
       = a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n
Thus by (1), G(v_k) = w_k for each k. But F(v_k) = w_k for each k. Accordingly, by Theorem 6.2,
F = G; hence {F_ij} generates Hom(V, U).

Proof that {F_ij} is linearly independent. Suppose, for scalars a_ij ∈ K,
Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij F_ij = 0

For v_k, k = 1,...,m,

0 = 0(v_k) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij F_ij(v_k) = Σ_{j=1}^{n} a_kj F_kj(v_k) = Σ_{j=1}^{n} a_kj u_j
  = a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n
But the u_j are linearly independent; hence for k = 1,...,m, we have a_k1 = 0, a_k2 = 0, ..., a_kn = 0.

In other words, all the a_ij = 0 and so {F_ij} is linearly independent.

Thus {F_ij} is a basis of Hom(V, U); hence dim Hom(V, U) = mn.


6.33. Prove Theorem 6.8: Let V, U and W be vector spaces over K. Let F,F' be linear
mappings from V into U and let G,G' be linear mappings from U into W; and let
k ∈ K. Then: (i) G∘(F+F') = G∘F + G∘F'; (ii) (G+G')∘F = G∘F + G'∘F;
(iii) k(G∘F) = (kG)∘F = G∘(kF).


(i) For every v ∈ V,
(G∘(F+F'))(v) = G((F+F')(v)) = G(F(v) + F'(v))
              = G(F(v)) + G(F'(v)) = (G∘F)(v) + (G∘F')(v) = (G∘F + G∘F')(v)
Since (G∘(F+F'))(v) = (G∘F + G∘F')(v) for every v ∈ V, G∘(F+F') = G∘F + G∘F'.

(ii) For every v ∈ V,
((G+G')∘F)(v) = (G+G')(F(v)) = G(F(v)) + G'(F(v))
              = (G∘F)(v) + (G'∘F)(v) = (G∘F + G'∘F)(v)
Since ((G+G')∘F)(v) = (G∘F + G'∘F)(v) for every v ∈ V, (G+G')∘F = G∘F + G'∘F.

(iii) For every v ∈ V,
(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG)∘F)(v)

and (k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G∘(kF))(v)

Accordingly, k(G∘F) = (kG)∘F = G∘(kF). (We emphasize that two mappings are shown to
be equal by showing that they assign the same image to each point in the domain.)


6.34. Let $F: V \to U$ and $G: U \to W$ be linear. Hence $(G \circ F): V \to W$ is linear. Show that
(i) $\mathrm{rank}(G \circ F) \le \mathrm{rank}\, G$, (ii) $\mathrm{rank}(G \circ F) \le \mathrm{rank}\, F$.

(i) Since $F(V) \subset U$, we also have $G(F(V)) \subset G(U)$ and so $\dim G(F(V)) \le \dim G(U)$. Then
$$\mathrm{rank}(G \circ F) = \dim((G \circ F)(V)) = \dim(G(F(V))) \le \dim G(U) = \mathrm{rank}\, G$$

(ii) By Theorem 6.4, $\dim(G(F(V))) \le \dim F(V)$. Hence
$$\mathrm{rank}(G \circ F) = \dim((G \circ F)(V)) = \dim(G(F(V))) \le \dim F(V) = \mathrm{rank}\, F$$


ALGEBRA OF LINEAR OPERATORS 
6.35. Let $S$ and $T$ be the linear operators on $\mathbf{R}^2$ defined by $S(x, y) = (y, x)$ and $T(x, y) = (0, x)$.
Find formulas defining the operators $S + T$, $2S - 3T$, $ST$, $TS$, $S^2$ and $T^2$.

$(S + T)(x, y) = S(x, y) + T(x, y) = (y, x) + (0, x) = (y, 2x)$.
$(2S - 3T)(x, y) = 2S(x, y) - 3T(x, y) = 2(y, x) - 3(0, x) = (2y, -x)$.
$(ST)(x, y) = S(T(x, y)) = S(0, x) = (x, 0)$.
$(TS)(x, y) = T(S(x, y)) = T(y, x) = (0, y)$.
$S^2(x, y) = S(S(x, y)) = S(y, x) = (x, y)$. Note $S^2 = I$, the identity mapping.
$T^2(x, y) = T(T(x, y)) = T(0, x) = (0, 0)$. Note $T^2 = 0$, the zero mapping.
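
The formulas above can be spot-checked by treating operators as ordinary functions. This is a sketch only: the helpers `add`, `scale` and `compose` are illustrative names, not anything from the text, and the check is made at a single sample vector.

```python
def S(v):
    x, y = v
    return (y, x)


def T(v):
    x, y = v
    return (0, x)


def add(f, g):
    """Pointwise sum of two operators."""
    return lambda v: tuple(a + b for a, b in zip(f(v), g(v)))


def scale(k, f):
    """Scalar multiple k*f of an operator."""
    return lambda v: tuple(k * a for a in f(v))


def compose(f, g):
    """Composition f o g: apply g first, then f."""
    return lambda v: f(g(v))


v = (3, 5)  # sample vector (x, y)
results = {
    "S+T": add(S, T)(v),                          # formula (y, 2x)
    "2S-3T": add(scale(2, S), scale(-3, T))(v),   # formula (2y, -x)
    "ST": compose(S, T)(v),                       # formula (x, 0)
    "TS": compose(T, S)(v),                       # formula (0, y)
    "S^2": compose(S, S)(v),                      # formula (x, y)
    "T^2": compose(T, T)(v),                      # formula (0, 0)
}
print(results)
```

Note that `compose(S, T)` and `compose(T, S)` give different results, which reflects that operator multiplication is not commutative.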


6.36. Let $T$ be the linear operator on $\mathbf{R}^2$ defined by
$$T(3, 1) = (2, -4) \quad \text{and} \quad T(1, 1) = (0, 2) \tag{1}$$
(By Theorem 6.2, such a linear operator exists and is unique.) Find $T(a, b)$. In
particular, find $T(7, 4)$.

First write $(a, b)$ as a linear combination of $(3, 1)$ and $(1, 1)$ using unknown scalars $x$ and $y$:
$$(a, b) = x(3, 1) + y(1, 1) \tag{2}$$
Hence $(a, b) = (3x, x) + (y, y) = (3x + y, x + y)$ and so
$$3x + y = a, \qquad x + y = b$$
Solving for $x$ and $y$ in terms of $a$ and $b$,
$$x = \tfrac{1}{2}a - \tfrac{1}{2}b \quad \text{and} \quad y = -\tfrac{1}{2}a + \tfrac{3}{2}b \tag{3}$$
Now using (1), (2) and (3),
$$T(a, b) = xT(3, 1) + yT(1, 1) = x(2, -4) + y(0, 2)$$
$$= (2x, -4x) + (0, 2y) = (2x, -4x + 2y) = (a - b, 5b - 3a)$$
Thus $T(7, 4) = (7 - 4, 20 - 21) = (3, -1)$.
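
The same computation can be carried out mechanically. The sketch below assumes exact rational arithmetic via the standard `fractions` module; the function name `T` simply mirrors the operator of this problem.

```python
from fractions import Fraction


def T(a, b):
    """Evaluate the operator fixed by T(3,1) = (2,-4) and T(1,1) = (0,2)."""
    a, b = Fraction(a), Fraction(b)
    # Coordinates of (a, b) in the basis (3,1), (1,1): solve 3x + y = a, x + y = b.
    x = (a - b) / 2
    y = b - x
    # Linearity: T(a, b) = x*T(3,1) + y*T(1,1) = x*(2,-4) + y*(0,2).
    return (2 * x, -4 * x + 2 * y)


print(T(3, 1), T(1, 1), T(7, 4))
```

The first two calls recover the defining values, and the third reproduces the value of $T(7, 4)$ obtained from the closed formula $(a - b, 5b - 3a)$.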


6.37. Let $T$ be the operator on $\mathbf{R}^3$ defined by $T(x, y, z) = (2x, 4x - y, 2x + 3y - z)$. (i) Show
that $T$ is invertible. (ii) Find a formula for $T^{-1}$.


(i) The kernel $W$ of $T$ is the set of all $(x, y, z)$ such that $T(x, y, z) = (0, 0, 0)$, i.e.,
$$T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)$$
Thus $W$ is the solution space of the homogeneous system
$$2x = 0, \qquad 4x - y = 0, \qquad 2x + 3y - z = 0$$
which has only the trivial solution $(0, 0, 0)$. Thus $W = \{0\}$; hence $T$ is nonsingular and so by
Theorem 6.9 is invertible.

(ii) Let $(r, s, t)$ be the image of $(x, y, z)$ under $T$; then $(x, y, z)$ is the image of $(r, s, t)$ under $T^{-1}$:
$T(x, y, z) = (r, s, t)$ and $T^{-1}(r, s, t) = (x, y, z)$. We will find the values of $x$, $y$ and $z$ in terms
of $r$, $s$ and $t$, and then substitute in the above formula for $T^{-1}$. From
$$T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (r, s, t)$$
we find $x = \tfrac{1}{2}r$, $y = 2r - s$, $z = 7r - 3s - t$. Thus $T^{-1}$ is given by
$$T^{-1}(r, s, t) = (\tfrac{1}{2}r, \; 2r - s, \; 7r - 3s - t)$$
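
As a quick sanity check, the operator and the inverse formula $T^{-1}(r, s, t) = (\tfrac{1}{2}r, 2r - s, 7r - 3s - t)$ can be composed in both orders on a few sample vectors. This is only a sketch; the name `T_inv` and the chosen samples are illustrative, and a finite set of samples is of course not a proof.

```python
from fractions import Fraction


def T(x, y, z):
    return (2 * x, 4 * x - y, 2 * x + 3 * y - z)


def T_inv(r, s, t):
    r = Fraction(r)  # exact halves instead of floating point
    return (r / 2, 2 * r - s, 7 * r - 3 * s - t)


samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -3, 5)]
round_trips = all(T_inv(*T(*v)) == v and T(*T_inv(*v)) == v for v in samples)
print(round_trips)
```

Because both maps are linear and the samples include a basis of $\mathbf{R}^3$, agreement on the three standard basis vectors already determines agreement everywhere.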


6.38. Let $V$ be of finite dimension and let $T$ be a linear operator on $V$. Recall that $T$ is
invertible if and only if $T$ is nonsingular or one-to-one. Show that $T$ is invertible if
and only if $T$ is onto.

By Theorem 6.4, $\dim V = \dim(\mathrm{Im}\, T) + \dim(\mathrm{Ker}\, T)$. Hence the following statements are
equivalent: (i) $T$ is onto, (ii) $\mathrm{Im}\, T = V$, (iii) $\dim(\mathrm{Im}\, T) = \dim V$, (iv) $\dim(\mathrm{Ker}\, T) = 0$,
(v) $\mathrm{Ker}\, T = \{0\}$, (vi) $T$ is nonsingular, (vii) $T$ is invertible.


6.39. Let $V$ be of finite dimension and let $T$ be a linear operator on $V$ for which $TS = I$,
for some operator $S$ on $V$. (We call $S$ a right inverse of $T$.) (i) Show that $T$ is
invertible. (ii) Show that $S = T^{-1}$. (iii) Give an example showing that the above
need not hold if $V$ is of infinite dimension.

(i) Let $\dim V = n$. By the preceding problem, $T$ is invertible if and only if $T$ is onto; hence $T$
is invertible if and only if $\mathrm{rank}\, T = n$. We have $n = \mathrm{rank}\, I = \mathrm{rank}\, TS \le \mathrm{rank}\, T \le n$.
Hence $\mathrm{rank}\, T = n$ and $T$ is invertible.

(ii) $TT^{-1} = T^{-1}T = I$. Then $S = IS = (T^{-1}T)S = T^{-1}(TS) = T^{-1}I = T^{-1}$.

(iii) Let $V$ be the space of polynomials in $t$ over $K$; say, $p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_nt^n$. Let $T$
and $S$ be the operators on $V$ defined by
$$T(p(t)) = 0 + a_1 + a_2t + \cdots + a_nt^{n-1} \quad \text{and} \quad S(p(t)) = a_0t + a_1t^2 + \cdots + a_nt^{n+1}$$
We have
$$(TS)(p(t)) = T(S(p(t))) = T(a_0t + a_1t^2 + \cdots + a_nt^{n+1}) = a_0 + a_1t + \cdots + a_nt^n = p(t)$$
and so $TS = I$, the identity mapping. On the other hand, if $k \in K$ and $k \ne 0$, then
$(ST)(k) = S(T(k)) = S(0) = 0 \ne k$. Accordingly, $ST \ne I$.
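
The counterexample in (iii) is easy to experiment with if a polynomial is stored as its list of coefficients `[a0, a1, ..., an]`. That representation, and the helper `trim` used to compare polynomials, are assumptions made for this sketch only.

```python
def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def T(p):
    """a0 + a1 t + ... + an t^n  ->  a1 + a2 t + ... + an t^(n-1)."""
    return trim(p[1:])


def S(p):
    """a0 + a1 t + ... + an t^n  ->  a0 t + a1 t^2 + ... + an t^(n+1)."""
    return trim([0] + list(p))


p = [4, 0, -2, 7]          # 4 - 2t^2 + 7t^3
constant = [5]             # the nonzero constant polynomial k = 5

ts_is_identity_on_p = T(S(p)) == trim(p)
st_of_constant = S(T(constant))   # the zero polynomial, stored as []
print(ts_is_identity_on_p, st_of_constant)
```

Applying `S` then `T` returns the original coefficients, while applying `T` then `S` destroys the constant term, which is exactly why $TS = I$ but $ST \ne I$ on this infinite-dimensional space.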


6.40. Let $S$ and $T$ be the linear operators on $\mathbf{R}^2$ defined by $S(x, y) = (0, x)$ and $T(x, y) = (x, 0)$.
Show that $TS = 0$ but $ST \ne 0$. Also show that $T^2 = T$.

$(TS)(x, y) = T(S(x, y)) = T(0, x) = (0, 0)$. Since $TS$ assigns $0 = (0, 0)$ to every $(x, y) \in \mathbf{R}^2$, it
is the zero mapping: $TS = 0$.

$(ST)(x, y) = S(T(x, y)) = S(x, 0) = (0, x)$. For example, $(ST)(4, 2) = (0, 4)$. Thus $ST \ne 0$, since
it does not assign $0 = (0, 0)$ to every element of $\mathbf{R}^2$.

For any $(x, y) \in \mathbf{R}^2$, $T^2(x, y) = T(T(x, y)) = T(x, 0) = (x, 0) = T(x, y)$. Hence $T^2 = T$.




MISCELLANEOUS PROBLEMS 


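6.41. Let $\{e_1, e_2, e_3\}$ be a basis of $V$ and $\{f_1, f_2\}$ a basis of $U$. Let $T: V \to U$ be linear.
Furthermore, suppose
$$\begin{aligned} T(e_1) &= a_1f_1 + a_2f_2 \\ T(e_2) &= b_1f_1 + b_2f_2 \\ T(e_3) &= c_1f_1 + c_2f_2 \end{aligned} \qquad \text{and} \qquad A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}$$
Show that, for any $v \in V$, $A[v]_e = [T(v)]_f$ where the vectors in $K^2$ and $K^3$ are written
as column vectors.

Suppose $v = k_1e_1 + k_2e_2 + k_3e_3$; then $[v]_e = \begin{pmatrix} k_1 \\ k_2 \\ k_3 \end{pmatrix}$. Also,
$$\begin{aligned} T(v) &= k_1T(e_1) + k_2T(e_2) + k_3T(e_3) \\ &= k_1(a_1f_1 + a_2f_2) + k_2(b_1f_1 + b_2f_2) + k_3(c_1f_1 + c_2f_2) \\ &= (a_1k_1 + b_1k_2 + c_1k_3)f_1 + (a_2k_1 + b_2k_2 + c_2k_3)f_2 \end{aligned}$$
Accordingly,
$$[T(v)]_f = \begin{pmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \end{pmatrix}$$
Computing, we obtain
$$A[v]_e = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \\ k_3 \end{pmatrix} = \begin{pmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \end{pmatrix} = [T(v)]_f$$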


6.42. Let $k$ be a nonzero scalar. Show that a linear map $T$ is singular if and only if $kT$
is singular. Hence $T$ is singular if and only if $-T$ is singular.

Suppose $T$ is singular. Then $T(v) = 0$ for some vector $v \ne 0$. Hence $(kT)(v) = kT(v) = k0 = 0$
and so $kT$ is singular.

Now suppose $kT$ is singular. Then $(kT)(w) = 0$ for some vector $w \ne 0$; hence $T(kw) =
kT(w) = (kT)(w) = 0$. But $k \ne 0$ and $w \ne 0$ implies $kw \ne 0$; thus $T$ is also singular.


6.43. Let $E$ be a linear operator on $V$ for which $E^2 = E$. (Such an operator is termed a
projection.) Let $U$ be the image of $E$ and $W$ the kernel. Show that: (i) if $u \in U$,
then $E(u) = u$, i.e. $E$ is the identity map on $U$; (ii) if $E \ne I$, then $E$ is singular, i.e.
$E(v) = 0$ for some $v \ne 0$; (iii) $V = U \oplus W$.

(i) If $u \in U$, the image of $E$, then $E(v) = u$ for some $v \in V$. Hence using $E^2 = E$, we have
$$u = E(v) = E^2(v) = E(E(v)) = E(u)$$
(ii) If $E \ne I$ then, for some $v \in V$, $E(v) = u$ where $v \ne u$. By (i), $E(u) = u$. Thus
$$E(v - u) = E(v) - E(u) = u - u = 0 \quad \text{where } v - u \ne 0$$
(iii) We first show that $V = U + W$. Let $v \in V$. Set $u = E(v)$ and $w = v - E(v)$. Then
$$v = E(v) + v - E(v) = u + w$$
By definition, $u = E(v) \in U$, the image of $E$. We now show that $w \in W$, the kernel of $E$:
$$E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0$$
and thus $w \in W$. Hence $V = U + W$.

We next show that $U \cap W = \{0\}$. Let $v \in U \cap W$. Since $v \in U$, $E(v) = v$ by (i). Since
$v \in W$, $E(v) = 0$. Thus $v = E(v) = 0$ and so $U \cap W = \{0\}$.

The above two properties imply that $V = U \oplus W$.




6.44. Show that a square matrix $A$ is invertible if and only if it is nonsingular. (Compare
with Theorem 6.9, page 130.)

Recall that $A$ is invertible if and only if $A$ is row equivalent to the identity matrix $I$. Thus the
following statements are equivalent: (i) $A$ is invertible. (ii) $A$ and $I$ are row equivalent. (iii) The
equations $AX = 0$ and $IX = 0$ have the same solution space. (iv) $AX = 0$ has only the zero
solution. (v) $A$ is nonsingular.


Supplementary Problems 


MAPPINGS 


6.45. State whether each diagram defines a mapping from $\{1, 2, 3\}$ into $\{4, 5, 6\}$.

[Diagrams not reproduced.]

6.46. Define each of the following mappings $f: \mathbf{R} \to \mathbf{R}$ by a formula:
(i) To each number let $f$ assign its square plus 3.
(ii) To each number let $f$ assign its cube plus twice the number.
(iii) To each number $\ge 8$ let $f$ assign the number squared, and to each number $< 8$ let $f$ assign
the number $-2$.

6.47. Let $f: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2 - 4x + 3$. Find (i) $f(4)$, (ii) $f(-3)$, (iii) $f(y - 2x)$, (iv) $f(x - 2)$.

6.48. Determine the number of different mappings from $\{a, b\}$ into $\{1, 2, 3\}$.

6.49. Let the mapping $g$ assign to each name in the set {Betty, Martin, David, Alan, Rebecca} the number
of different letters needed to spell the name. Find (i) the graph of $g$, (ii) the image of $g$.


6.50. Sketch the graph of each mapping: (i) f(z) = d«e—1, (ii) GA) eA


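6.51. The mappings $f: A \to B$, $g: B \to A$, $h: C \to B$, $F: B \to C$ and $G: A \to C$ are illustrated in the
diagram below.

[Diagram not reproduced.]

Determine whether each of the following defines a composition mapping and, if it does, find its
domain and co-domain: (i) $g \circ f$, (ii) $h \circ f$, (iii) $F \circ f$, (iv) $G \circ f$, (v) $g \circ h$, (vi) $h \circ G \circ g$.

6.52. Let $f: \mathbf{R} \to \mathbf{R}$ and $g: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2 + 3x + 1$ and $g(x) = 2x - 3$. Find formulas
defining the composition mappings (i) $f \circ g$, (ii) $g \circ f$, (iii) $g \circ g$, (iv) $f \circ f$.

6.53. For any mapping $f: A \to B$, show that $1_B \circ f = f = f \circ 1_A$.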




6.54. For each of the following mappings $f: \mathbf{R} \to \mathbf{R}$ find a formula for the inverse mapping: (i) $f(x) =
3x - 7$, (ii) $f(x) = x^3 + 2$.


LINEAR MAPPINGS 
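6.55. Show that the following mappings $F$ are linear:
(i) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (2x - y, x)$.
(ii) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (z, x + y)$.
(iii) $F: \mathbf{R} \to \mathbf{R}^2$ defined by $F(x) = (2x, 3x)$.
(iv) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (ax + by, cx + dy)$ where $a, b, c, d \in \mathbf{R}$.


6.56. Show that the following mappings $F$ are not linear:
(i) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (x^2, y^2)$.
(ii) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (x + 1, y + z)$.
(iii) $F: \mathbf{R} \to \mathbf{R}^2$ defined by $F(x) = (x, 1)$.
(iv) $F: \mathbf{R}^2 \to \mathbf{R}$ defined by $F(x, y) = |x - y|$.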


6.57. Let $V$ be the vector space of polynomials in $t$ over $K$. Show that the mappings $T: V \to V$ and
$S: V \to V$ defined below are linear:

$$T(a_0 + a_1t + \cdots + a_nt^n) = a_0t + a_1t^2 + \cdots + a_nt^{n+1}$$
$$S(a_0 + a_1t + \cdots + a_nt^n) = 0 + a_1 + a_2t + \cdots + a_nt^{n-1}$$


6.58. Let V be the vector space of n X n matrices over K; and let M be an arbitrary matrix in V. Show 
that the first two mappings 7:V—->V are linear, but the third is not linear (unless M = 0): 
(i) T(A) = MA, (ii) T(A) = MA—AM, (iii) T(A) = M+A. 


6.59. Find $T(a, b)$ where $T: \mathbf{R}^2 \to \mathbf{R}^3$ is defined by $T(1, 2) = (3, -1, 5)$ and $T(0, 1) = (2, 1, -1)$.


6.60. Find $T(a, b, c)$ where $T: \mathbf{R}^3 \to \mathbf{R}$ is defined by
$$T(1, 1, 1) = 3, \quad T(0, 1, -2) = 1 \quad \text{and} \quad T(0, 0, 1) = -2$$


6.61. Suppose F':V-—U is linear. Show that, for any v EV, F(—v) = —F(v). 


6.62. Let $W$ be a subspace of $V$. Show that the inclusion map of $W$ into $V$, denoted by $i: W \subset V$ and
defined by $i(w) = w$, is linear.


KERNEL AND IMAGE OF LINEAR MAPPINGS 

6.63. For each of the following linear mappings $F$, find a basis and the dimension of (a) its image $U$
and (b) its kernel $W$:
(i) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x, y, z) = (x + 2y, y - z, x + 2z)$.
(ii) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (x + y, x + y)$.
(iii) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (x + y, y + z)$.

6.64. Let V be the vector space of 2 X 2 matrices over R and let M = ( : a) =) Let] FeV Vaeabevtne 
linear map defined by F(A) = MA. Find a basis and the dimension of (i) the kernel W of F' and 


(ii) the image U of F. 
6.65. Find a linear mapping $F: \mathbf{R}^3 \to \mathbf{R}^3$ whose image is generated by $(1, 2, 3)$ and $(4, 5, 6)$.


6.66. Find a linear mapping F’:R4— R? whose kernel is generated by (1, 2,3, 4) and (Rae ate aB 


6.67. Let V be the vector space of polynomials in t over R. Let D:V—>V be the differential operator: 
D(f) = df/dt. Find the kernel and image of D. 


6.68. Let #:V—>U be linear. Show that (i) the image of any subspace of V is a subspace of U and 
(ii) the preimage of any subspace of U is a subspace of V. 




6.69. Each of the following matrices determines a linear map from $\mathbf{R}^4$ into $\mathbf{R}^3$:


Te yom er Ogee (en One oT 
GA Ore peor CN Fat Sead), MOOG Sees itt 
(ESE Nye ats 2 SSO 


Find a basis and the dimension of the image U and the kernel W of each map. 


6.70. Let $T: \mathbf{C} \to \mathbf{C}$ be the conjugate mapping on the complex field $\mathbf{C}$. That is, $T(z) = \bar{z}$ where $z \in \mathbf{C}$,
or $T(a + bi) = a - bi$ where $a, b \in \mathbf{R}$. (i) Show that $T$ is not linear if $\mathbf{C}$ is viewed as a vector
space over itself. (ii) Show that $T$ is linear if $\mathbf{C}$ is viewed as a vector space over the real field $\mathbf{R}$.


OPERATIONS WITH LINEAR MAPPINGS 


6.71. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ and $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (y, x + z)$ and $G(x, y, z) = (2z, x - y)$.
Find formulas defining the mappings $F + G$ and $3F - 2G$.


6.72. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x, y) = (y, 2x)$. Using the mappings $F$ and $G$ in the preceding
problem, find formulas defining the mappings: (i) $H \circ F$ and $H \circ G$, (ii) $F \circ H$ and $G \circ H$,
(iii) $H \circ (F + G)$ and $H \circ F + H \circ G$.


6.73. Show that the following mappings F’, G and H are linearly independent: 
(i) $F, G, H \in \mathrm{Hom}(\mathbf{R}^2, \mathbf{R}^2)$ defined by
$F(x, y) = (x, 2y)$, $G(x, y) = (y, x + y)$, $H(x, y) = (0, x)$.
(ii) F,G,H € Hom (R3, R) defined by 
JES Da) = OS paeray CAC i wp ry) —— ome Jal Gi Wp cA ei oy oe 


6.74. For $F, G \in \mathrm{Hom}(V, U)$, show that $\mathrm{rank}(F + G) \le \mathrm{rank}\, F + \mathrm{rank}\, G$. (Here $V$ has finite
dimension.)


6.75. Let F:V>U and G:U-V be linear. Show that if F and G are nonsingular then GoF is 
nonsingular. Give an example where GF is nonsingular but G is not. 


6.76. Prove that Hom(V, U) does satisfy all the required axioms of a vector space. That is, prove 
Theorem 6.6, page 128. 


ALGEBRA OF LINEAR OPERATORS 


6.77. Let $S$ and $T$ be the linear operators on $\mathbf{R}^2$ defined by $S(x, y) = (x + y, 0)$ and $T(x, y) = (-y, x)$.
Find formulas defining the operators $S + T$, $5S - 3T$, $ST$, $TS$, $S^2$ and $T^2$.


6.78. Let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(x, y) = (x + 2y, 3x + 4y)$. Find $p(T)$ where
$p(t) = t^2 - 5t - 2$.


6.79. Show that each of the following operators $T$ on $\mathbf{R}^3$ is invertible, and find a formula for $T^{-1}$:
(i) $T(x, y, z) = (x - 3y - 2z, y - 4z, z)$, (ii) $T(x, y, z) = (x + z, x - z, y)$.


6.80. Suppose $S$ and $T$ are linear operators on $V$ and that $S$ is nonsingular. Assume $V$ has finite
dimension. Show that $\mathrm{rank}(ST) = \mathrm{rank}(TS) = \mathrm{rank}\, T$.


6.81. Suppose $V = U \oplus W$. Let $E_1$ and $E_2$ be the linear operators on $V$ defined by $E_1(v) = u$,
$E_2(v) = w$, where $v = u + w$, $u \in U$, $w \in W$. Show that: (i) $E_1^2 = E_1$ and $E_2^2 = E_2$, i.e. that $E_1$
and $E_2$ are "projections"; (ii) $E_1 + E_2 = I$, the identity mapping; (iii) $E_1E_2 = 0$ and $E_2E_1 = 0$.


6.82. Let $E_1$ and $E_2$ be linear operators on $V$ satisfying (i), (ii) and (iii) of Problem 6.81. Show that $V$
is the direct sum of the image of $E_1$ and the image of $E_2$: $V = \mathrm{Im}\, E_1 \oplus \mathrm{Im}\, E_2$.


6.83. Show that if the linear operators $S$ and $T$ are invertible, then $ST$ is invertible and $(ST)^{-1} = T^{-1}S^{-1}$.

6.84. Let $V$ have finite dimension, and let $T$ be a linear operator on $V$ such that $\mathrm{rank}(T^2) = \mathrm{rank}\, T$.
Show that $\mathrm{Ker}\, T \cap \mathrm{Im}\, T = \{0\}$.


MISCELLANEOUS PROBLEMS


6.85. Suppose $T: K^n \to K^m$ is a linear mapping. Let $\{e_1, \ldots, e_n\}$ be the usual basis of $K^n$ and let $A$ be
the $m \times n$ matrix whose columns are the vectors $T(e_1), \ldots, T(e_n)$ respectively. Show that, for every
vector $v \in K^n$, $T(v) = Av$, where $v$ is written as a column vector.


6.86. Suppose $F: V \to U$ is linear and $k$ is a nonzero scalar. Show that the maps $F$ and $kF$ have the
same kernel and the same image.


6.87. Show that if $F: V \to U$ is onto, then $\dim U \le \dim V$. Determine all linear maps $T: \mathbf{R}^3 \to \mathbf{R}^4$
which are onto.


6.88. Find those theorems of Chapter 3 which prove that the space of $n$-square matrices over $K$ is an
associative algebra over $K$.


6.89. Let $T: V \to U$ be linear and let $W$ be a subspace of $V$. The restriction of $T$ to $W$ is the map
$T_W: W \to U$ defined by $T_W(w) = T(w)$, for every $w \in W$. Prove the following. (i) $T_W$ is linear.
(ii) $\mathrm{Ker}\, T_W = \mathrm{Ker}\, T \cap W$. (iii) $\mathrm{Im}\, T_W = T(W)$.


6.90. Two operators $S, T \in A(V)$ are said to be similar if there exists an invertible operator $P \in A(V)$
for which $S = P^{-1}TP$. Prove the following. (i) Similarity of operators is an equivalence relation.
(ii) Similar operators have the same rank (when $V$ has finite dimension).


Answers to Supplementary Problems 


6.45. (i) No, (ii) Yes, (iii) No.

6.46. (i) $f(x) = x^2 + 3$, (ii) $f(x) = x^3 + 2x$, (iii) $f(x) = x^2$ if $x \ge 8$, and $f(x) = -2$ if $x < 8$.

6.47. (i) 3, (ii) 24, (iii) $y^2 - 4xy + 4x^2 - 4y + 8x + 3$, (iv) $x^2 - 8x + 15$.

6.48. Nine.

6.49. (i) {(Betty, 4), (Martin, 6), (David, 4), (Alan, 3), (Rebecca, 5)}.
(ii) Image of $g$ = {3, 4, 5, 6}.

6.51. (i) $(g \circ f): A \to A$, (ii) No, (iii) $(F \circ f): A \to C$, (iv) No, (v) $(g \circ h): C \to A$, (vi) $(h \circ G \circ g): B \to B$.

6.52. (i) $(f \circ g)(x) = 4x^2 - 6x + 1$, (ii) $(g \circ f)(x) = 2x^2 + 6x - 1$,
(iii) $(g \circ g)(x) = 4x - 9$, (iv) $(f \circ f)(x) = x^4 + 6x^3 + 14x^2 + 15x + 5$.

6.54. (i) $f^{-1}(x) = (x + 7)/3$, (ii) $f^{-1}(x) = \sqrt[3]{x - 2}$.

6.59. $T(a, b) = (-a + 2b, -3a + b, 7a - b)$.

6.60. $T(a, b, c) = 8a - 3b - 2c$.

6.61. $F(v) + F(-v) = F(v + (-v)) = F(0) = 0$; hence $F(-v) = -F(v)$.

6.63. (i) (a) $\{(1, 0, 1), (0, 1, -2)\}$, $\dim U = 2$; (b) $\{(2, -1, -1)\}$, $\dim W = 1$.
(ii) (a) $\{(1, 1)\}$, $\dim U = 1$; (b) $\{(1, -1)\}$, $\dim W = 1$.
(iii) (a) $\{(1, 0), (0, 1)\}$, $\dim U = 2$; (b) $\{(1, -1, 1)\}$, $\dim W = 1$.




: Ln () Oe, 
6.64. (i) (; ry ; G i} basis of Ker F; dim (Ker F) = 2. 


(ii) ein he basis of ImF; dim (Im F) = 
BO LG ( i asis of Im F; dim (Im F) = 2. 


6.65. $F(x, y, z) = (x + 4y, 2x + 5y, 3x + 6y)$.
6.66. $F(x, y, z, w) = (x + y - z, 2x + y - w, 0)$.
6.67. The kernel of $D$ is the set of constant polynomials. The image of $D$ is the entire space $V$.
6.69. (i) (a) {(1,2,1), (0,1,1)} basis of Im A; dim (Im 4A) = 2. 
(b) {(4, —2, —5, 0), (1,—8,0,5)} basis of Ker A; dim (Ker A) = 2. 
(ii) (a) ImB=R3; (6) {(—1,2/3,1,1)} basis of Ker B; dim (Ker B) = 1. 


6.71. $(F + G)(x, y, z) = (y + 2z, 2x - y + z)$, $(3F - 2G)(x, y, z) = (3y - 4z, x + 2y + 3z)$.


6.72. (i) $(H \circ F)(x, y, z) = (x + z, 2y)$, $(H \circ G)(x, y, z) = (x - y, 4z)$. (ii) Not defined.
(iii) $(H \circ (F + G))(x, y, z) = (H \circ F + H \circ G)(x, y, z) = (2x - y + z, 2y + 4z)$.


6.77. $(S + T)(x, y) = (x, x)$
$(5S - 3T)(x, y) = (5x + 8y, -3x)$
$(ST)(x, y) = (x - y, 0)$
$(TS)(x, y) = (0, x + y)$
$S^2(x, y) = (x + y, 0)$; note that $S^2 = S$.
$T^2(x, y) = (-x, -y)$; note that $T^2 + I = 0$, hence $T$ is a zero of $x^2 + 1$.


6.78. $p(T) = 0$.
6.79. (i) $T^{-1}(r, s, t) = (14t + 3s + r, 4t + s, t)$, (ii) $T^{-1}(r, s, t) = (\tfrac{1}{2}r + \tfrac{1}{2}s, t, \tfrac{1}{2}r - \tfrac{1}{2}s)$.


6.87. There are no linear maps from $\mathbf{R}^3$ into $\mathbf{R}^4$ which are onto.


Chapter 7 


Matrices and Linear Operators 


INTRODUCTION 
Suppose $\{e_1, \ldots, e_n\}$ is a basis of a vector space $V$ over a field $K$ and, for $v \in V$, suppose
$v = a_1e_1 + a_2e_2 + \cdots + a_ne_n$. Then the coordinate vector of $v$ relative to $\{e_i\}$, which we write
as a column vector unless otherwise specified or implied, is
$$[v]_e = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}$$

Recall that the mapping $v \mapsto [v]_e$, determined by the basis $\{e_i\}$, is an isomorphism from $V$
onto the space $K^n$.

In this chapter we show that there is also an isomorphism, determined by the basis
$\{e_i\}$, from the algebra $A(V)$ of linear operators on $V$ onto the algebra $\mathcal{A}$ of $n$-square matrices
over $K$.

A similar result also holds for linear mappings $F: V \to U$, from one space into another.


MATRIX REPRESENTATION OF A LINEAR OPERATOR 


Let $T$ be a linear operator on a vector space $V$ over a field $K$ and suppose $\{e_1, \ldots, e_n\}$ is
a basis of $V$. Now $T(e_1), \ldots, T(e_n)$ are vectors in $V$ and so each is a linear combination of
the elements of the basis $\{e_i\}$:
$$\begin{aligned} T(e_1) &= a_{11}e_1 + a_{12}e_2 + \cdots + a_{1n}e_n \\ T(e_2) &= a_{21}e_1 + a_{22}e_2 + \cdots + a_{2n}e_n \\ &\;\;\vdots \\ T(e_n) &= a_{n1}e_1 + a_{n2}e_2 + \cdots + a_{nn}e_n \end{aligned}$$

The following definition applies.


Definition: The transpose of the above matrix of coefficients, denoted by $[T]_e$ or $[T]$, is
called the matrix representation of $T$ relative to the basis $\{e_i\}$ or simply the
matrix of $T$ in the basis $\{e_i\}$:
$$[T]_e = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix}$$


Example 7.1: Let $V$ be the vector space of polynomials in $t$ over $\mathbf{R}$ of degree $\le 3$, and let $D: V \to V$
be the differential operator defined by $D(p(t)) = d(p(t))/dt$. We compute the matrix
of $D$ in the basis $\{1, t, t^2, t^3\}$. We have:
$$\begin{aligned} D(1) &= 0 = 0 + 0t + 0t^2 + 0t^3 \\ D(t) &= 1 = 1 + 0t + 0t^2 + 0t^3 \\ D(t^2) &= 2t = 0 + 2t + 0t^2 + 0t^3 \\ D(t^3) &= 3t^2 = 0 + 0t + 3t^2 + 0t^3 \end{aligned}$$
Accordingly,
$$[D] = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Example 7.2: Let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(x, y) = (4x - 2y, 2x + y)$. We
compute the matrix of $T$ in the basis $\{f_1 = (1, 1), f_2 = (-1, 0)\}$. We have
$$\begin{aligned} T(f_1) &= T(1, 1) = (2, 3) = 3(1, 1) + (-1, 0) = 3f_1 + f_2 \\ T(f_2) &= T(-1, 0) = (-4, -2) = -2(1, 1) + 2(-1, 0) = -2f_1 + 2f_2 \end{aligned}$$
Accordingly, $[T]_f = \begin{pmatrix} 3 & -2 \\ 1 & 2 \end{pmatrix}$.
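
The recipe of Example 7.2 (apply $T$ to each basis vector, express the result in the basis, and place the coordinates in columns) can be sketched in a few lines. The helper names `coords` and `matrix_in_basis` are illustrative, and the code assumes a basis of $\mathbf{R}^2$ so that Cramer's rule applies.

```python
from fractions import Fraction


def coords(v, f1, f2):
    """Coordinates (x, y) with v = x*f1 + y*f2, found by Cramer's rule."""
    det = Fraction(f1[0] * f2[1] - f2[0] * f1[1])
    x = (v[0] * f2[1] - f2[0] * v[1]) / det
    y = (f1[0] * v[1] - v[0] * f1[1]) / det
    return (x, y)


def T(v):
    x, y = v
    return (4 * x - 2 * y, 2 * x + y)


def matrix_in_basis(op, f1, f2):
    """Matrix of op in the basis {f1, f2}: column i holds the coordinates of op(f_i)."""
    c1 = coords(op(f1), f1, f2)
    c2 = coords(op(f2), f1, f2)
    return [[c1[0], c2[0]], [c1[1], c2[1]]]


M = matrix_in_basis(T, (1, 1), (-1, 0))
print(M)
```

Writing the coordinates as columns rather than rows is what the word "transpose" in the definition amounts to.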


Remark: Recall that any $n$-square matrix $A$ over $K$ defines a linear operator on $K^n$ by
the map $v \mapsto Av$ (where $v$ is written as a column vector). We show (Problem
7.7) that the matrix representation of this operator is precisely the matrix $A$
if we use the usual basis of $K^n$.


Our first theorem tells us that the "action" of an operator $T$ on a vector $v$ is preserved
by its matrix representation:

Theorem 7.1: Let $\{e_1, \ldots, e_n\}$ be a basis of $V$ and let $T$ be any operator on $V$. Then, for
any vector $v \in V$, $[T]_e[v]_e = [T(v)]_e$.


That is, if we multiply the coordinate vector of $v$ by the matrix representation of $T$,
then we obtain the coordinate vector of $T(v)$.


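Example 7.3: Consider the differential operator $D: V \to V$ in Example 7.1. Let
$$p(t) = a + bt + ct^2 + dt^3 \quad \text{and so} \quad D(p(t)) = b + 2ct + 3dt^2$$
Hence, relative to the basis $\{1, t, t^2, t^3\}$,
$$[p(t)] = \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} \quad \text{and} \quad [D(p(t))] = \begin{pmatrix} b \\ 2c \\ 3d \\ 0 \end{pmatrix}$$
We show that Theorem 7.1 does hold here:
$$[D][p(t)] = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} b \\ 2c \\ 3d \\ 0 \end{pmatrix} = [D(p(t))]$$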
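
The matrix of the differential operator can also be generated for any degree bound. The following sketch assumes the basis $1, t, \ldots, t^n$ and stores a polynomial as its coefficient list; `D_matrix` and `mat_vec` are names chosen here for illustration.

```python
def D_matrix(n):
    """Matrix of d/dt on polynomials of degree <= n in the basis 1, t, ..., t^n.

    Column j holds the coordinates of D(t^j) = j * t^(j-1).
    """
    size = n + 1
    M = [[0] * size for _ in range(size)]
    for j in range(1, size):
        M[j - 1][j] = j
    return M


def mat_vec(M, v):
    """Product of a matrix with a column vector given as a list."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in M]


D = D_matrix(3)
a, b, c, d = 5, 7, -2, 4              # p(t) = 5 + 7t - 2t^2 + 4t^3
derivative_coords = mat_vec(D, [a, b, c, d])
print(D)
print(derivative_coords)              # coordinates (b, 2c, 3d, 0) of D(p(t))
```

Multiplying the coordinate vector of $p(t)$ by the matrix yields the coordinate vector of the derivative, which is Theorem 7.1 in this special case.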


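Example 7.4: Consider the linear operator $T: \mathbf{R}^2 \to \mathbf{R}^2$ in Example 7.2: $T(x, y) = (4x - 2y, 2x + y)$.
Let $v = (5, 7)$. Then
$$\begin{aligned} v &= (5, 7) = 7(1, 1) + 2(-1, 0) = 7f_1 + 2f_2 \\ T(v) &= (6, 17) = 17(1, 1) + 11(-1, 0) = 17f_1 + 11f_2 \end{aligned}$$
where $f_1 = (1, 1)$ and $f_2 = (-1, 0)$. Hence, relative to the basis $\{f_1, f_2\}$,
$$[v]_f = \begin{pmatrix} 7 \\ 2 \end{pmatrix} \quad \text{and} \quad [T(v)]_f = \begin{pmatrix} 17 \\ 11 \end{pmatrix}$$
Using the matrix $[T]_f$ in Example 7.2, we verify that Theorem 7.1 holds here:
$$[T]_f[v]_f = \begin{pmatrix} 3 & -2 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 7 \\ 2 \end{pmatrix} = \begin{pmatrix} 17 \\ 11 \end{pmatrix} = [T(v)]_f$$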




Now we have associated a matrix $[T]_e$ to each $T$ in $A(V)$, the algebra of linear operators
on $V$. By our first theorem the action of an individual operator $T$ is preserved by this
representation. The next two theorems tell us that the three basic operations with these
operators
(i) addition, (ii) scalar multiplication, (iii) composition
are also preserved.


Theorem 7.2: Let $\{e_1, \ldots, e_n\}$ be a basis of $V$ over $K$, and let $\mathcal{A}$ be the algebra of
$n$-square matrices over $K$. Then the mapping $T \mapsto [T]_e$ is a vector space
isomorphism from $A(V)$ onto $\mathcal{A}$. That is, the mapping is one-one and onto
and, for any $S, T \in A(V)$ and any $k \in K$,
$$[T + S]_e = [T]_e + [S]_e \quad \text{and} \quad [kT]_e = k[T]_e$$


Theorem 7.3: For any operators $S, T \in A(V)$, $[ST]_e = [S]_e[T]_e$.


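We illustrate the above theorems in the case $\dim V = 2$. Suppose $\{e_1, e_2\}$ is a basis of
$V$, and $T$ and $S$ are operators on $V$ for which
$$\begin{aligned} T(e_1) &= a_1e_1 + a_2e_2 & \qquad S(e_1) &= c_1e_1 + c_2e_2 \\ T(e_2) &= b_1e_1 + b_2e_2 & \qquad S(e_2) &= d_1e_1 + d_2e_2 \end{aligned}$$
Then
$$[T]_e = \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} \quad \text{and} \quad [S]_e = \begin{pmatrix} c_1 & d_1 \\ c_2 & d_2 \end{pmatrix}$$
Now we have
$$\begin{aligned} (T + S)(e_1) &= T(e_1) + S(e_1) = a_1e_1 + a_2e_2 + c_1e_1 + c_2e_2 = (a_1 + c_1)e_1 + (a_2 + c_2)e_2 \\ (T + S)(e_2) &= T(e_2) + S(e_2) = b_1e_1 + b_2e_2 + d_1e_1 + d_2e_2 = (b_1 + d_1)e_1 + (b_2 + d_2)e_2 \end{aligned}$$
Thus
$$[T + S]_e = \begin{pmatrix} a_1 + c_1 & b_1 + d_1 \\ a_2 + c_2 & b_2 + d_2 \end{pmatrix} = \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} + \begin{pmatrix} c_1 & d_1 \\ c_2 & d_2 \end{pmatrix} = [T]_e + [S]_e$$
Also, for $k \in K$, we have
$$\begin{aligned} (kT)(e_1) &= kT(e_1) = k(a_1e_1 + a_2e_2) = ka_1e_1 + ka_2e_2 \\ (kT)(e_2) &= kT(e_2) = k(b_1e_1 + b_2e_2) = kb_1e_1 + kb_2e_2 \end{aligned}$$
Hence
$$[kT]_e = \begin{pmatrix} ka_1 & kb_1 \\ ka_2 & kb_2 \end{pmatrix} = k\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} = k[T]_e$$
Finally, we have
$$\begin{aligned} (ST)(e_1) &= S(T(e_1)) = S(a_1e_1 + a_2e_2) = a_1S(e_1) + a_2S(e_2) \\ &= a_1(c_1e_1 + c_2e_2) + a_2(d_1e_1 + d_2e_2) \\ &= (a_1c_1 + a_2d_1)e_1 + (a_1c_2 + a_2d_2)e_2 \\ (ST)(e_2) &= S(T(e_2)) = S(b_1e_1 + b_2e_2) = b_1S(e_1) + b_2S(e_2) \\ &= b_1(c_1e_1 + c_2e_2) + b_2(d_1e_1 + d_2e_2) \\ &= (b_1c_1 + b_2d_1)e_1 + (b_1c_2 + b_2d_2)e_2 \end{aligned}$$
Accordingly,
$$[ST]_e = \begin{pmatrix} a_1c_1 + a_2d_1 & b_1c_1 + b_2d_1 \\ a_1c_2 + a_2d_2 & b_1c_2 + b_2d_2 \end{pmatrix} = \begin{pmatrix} c_1 & d_1 \\ c_2 & d_2 \end{pmatrix} \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} = [S]_e[T]_e$$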
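
The three identities just derived can be checked numerically for particular operators on $K^2$ with the usual basis. The sketch below picks two arbitrary integer matrices as an assumption; `apply`, `matrix_of` and `mat_mul` are illustrative helper names.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def apply(M, v):
    """Action of the matrix M on the column vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def matrix_of(op, n):
    """Matrix whose j-th column is the coordinate vector of op(e_j)."""
    cols = [op([1 if k == j else 0 for k in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


T_mat = [[1, 2], [3, 4]]      # arbitrary sample matrices
S_mat = [[0, -1], [5, 2]]
T = lambda v: apply(T_mat, v)
S = lambda v: apply(S_mat, v)

ST = lambda v: S(T(v))                                      # composition: T first, then S
T_plus_S = lambda v: [t + s for t, s in zip(T(v), S(v))]
three_T = lambda v: [3 * t for t in T(v)]

product_rule_holds = matrix_of(ST, 2) == mat_mul(S_mat, T_mat)
sum_matrix = matrix_of(T_plus_S, 2)
scalar_matrix = matrix_of(three_T, 2)
print(product_rule_holds, sum_matrix, scalar_matrix)
```

The order matters: the matrix of $ST$ is $[S][T]$, and for these sample matrices $[T][S]$ gives a different result.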




CHANGE OF BASIS 


We have shown that we can represent vectors by $n$-tuples (column vectors) and linear
operators by matrices once we have selected a basis. We ask the following natural question:
How does our representation change if we select another basis? In order to answer this
question, we first need a definition.


Definition: Let $\{e_1, \ldots, e_n\}$ be a basis of $V$ and let $\{f_1, \ldots, f_n\}$ be another basis. Suppose
$$\begin{aligned} f_1 &= a_{11}e_1 + a_{12}e_2 + \cdots + a_{1n}e_n \\ f_2 &= a_{21}e_1 + a_{22}e_2 + \cdots + a_{2n}e_n \\ &\;\;\vdots \\ f_n &= a_{n1}e_1 + a_{n2}e_2 + \cdots + a_{nn}e_n \end{aligned}$$
Then the transpose $P$ of the above matrix of coefficients is termed the transition
matrix from the "old" basis $\{e_i\}$ to the "new" basis $\{f_i\}$:
$$P = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix}$$

We comment that since the vectors $f_1, \ldots, f_n$ are linearly independent, the matrix $P$ is
invertible (Problem 5.47). In fact, its inverse $P^{-1}$ is the transition matrix from the basis
$\{f_i\}$ back to the basis $\{e_i\}$.


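Example 7.5: Consider the following two bases of $\mathbf{R}^2$:
$$\{e_1 = (1, 0), \; e_2 = (0, 1)\} \quad \text{and} \quad \{f_1 = (1, 1), \; f_2 = (-1, 0)\}$$
Then
$$\begin{aligned} f_1 &= (1, 1) = (1, 0) + (0, 1) = e_1 + e_2 \\ f_2 &= (-1, 0) = -(1, 0) + 0(0, 1) = -e_1 + 0e_2 \end{aligned}$$
Hence the transition matrix $P$ from the basis $\{e_i\}$ to the basis $\{f_i\}$ is
$$P = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix}$$
We also have
$$\begin{aligned} e_1 &= (1, 0) = 0(1, 1) - (-1, 0) = 0f_1 - f_2 \\ e_2 &= (0, 1) = (1, 1) + (-1, 0) = f_1 + f_2 \end{aligned}$$
Hence the transition matrix $Q$ from the basis $\{f_i\}$ back to the basis $\{e_i\}$ is
$$Q = \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix}$$
Observe that $P$ and $Q$ are inverses:
$$PQ = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$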
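
The relation between the two transition matrices of Example 7.5 can be confirmed by inverting $P$ directly. This sketch assumes 2 by 2 matrices stored as lists of rows and uses the adjugate formula; `inverse_2x2` and `mat_mul` are illustrative names.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


# Columns of P are the coordinates of f1 = (1,1) and f2 = (-1,0) in the usual basis.
P = [[1, -1], [1, 0]]
Q = inverse_2x2(P)
identity = [[1, 0], [0, 1]]
print(Q, mat_mul(P, Q) == identity, mat_mul(Q, P) == identity)
```

The columns of the computed inverse are the coordinates of $e_1$ and $e_2$ in the basis $\{f_1, f_2\}$, matching the expressions $e_1 = 0f_1 - f_2$ and $e_2 = f_1 + f_2$.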
We now show how coordinate vectors are affected by a change of basis.


Theorem 7.4: Let $P$ be the transition matrix from a basis $\{e_i\}$ to a basis $\{f_i\}$ in a vector
space $V$. Then, for any vector $v \in V$, $P[v]_f = [v]_e$. Hence $[v]_f = P^{-1}[v]_e$.


We emphasize that even though $P$ is called the transition matrix from the old basis
$\{e_i\}$ to the new basis $\{f_i\}$, its effect is to transform the coordinates of a vector in the new
basis $\{f_i\}$ back to the coordinates in the old basis $\{e_i\}$.


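We illustrate the above theorem in the case $\dim V = 3$. Suppose $P$ is the transition
matrix from a basis $\{e_1, e_2, e_3\}$ of $V$ to a basis $\{f_1, f_2, f_3\}$ of $V$; say,
$$\begin{aligned} f_1 &= a_1e_1 + a_2e_2 + a_3e_3 \\ f_2 &= b_1e_1 + b_2e_2 + b_3e_3 \\ f_3 &= c_1e_1 + c_2e_2 + c_3e_3 \end{aligned} \qquad \text{Hence} \quad P = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}$$

Now suppose $v \in V$ and, say, $v = k_1f_1 + k_2f_2 + k_3f_3$. Then, substituting for the $f_i$ from
above, we obtain
$$\begin{aligned} v &= k_1(a_1e_1 + a_2e_2 + a_3e_3) + k_2(b_1e_1 + b_2e_2 + b_3e_3) + k_3(c_1e_1 + c_2e_2 + c_3e_3) \\ &= (a_1k_1 + b_1k_2 + c_1k_3)e_1 + (a_2k_1 + b_2k_2 + c_2k_3)e_2 + (a_3k_1 + b_3k_2 + c_3k_3)e_3 \end{aligned}$$
Thus
$$[v]_f = \begin{pmatrix} k_1 \\ k_2 \\ k_3 \end{pmatrix} \quad \text{and} \quad [v]_e = \begin{pmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{pmatrix}$$
Accordingly,
$$P[v]_f = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \\ k_3 \end{pmatrix} = \begin{pmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{pmatrix} = [v]_e$$

Also, multiplying the above equation by $P^{-1}$, we have
$$P^{-1}[v]_e = P^{-1}P[v]_f = I[v]_f = [v]_f$$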
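Example 7.6: Let $v = (a, b) \in \mathbf{R}^2$. Then, for the bases of $\mathbf{R}^2$ in the preceding example,
$$\begin{aligned} v &= (a, b) = a(1, 0) + b(0, 1) = ae_1 + be_2 \\ v &= (a, b) = b(1, 1) + (b - a)(-1, 0) = bf_1 + (b - a)f_2 \end{aligned}$$
Hence
$$[v]_e = \begin{pmatrix} a \\ b \end{pmatrix} \quad \text{and} \quad [v]_f = \begin{pmatrix} b \\ b - a \end{pmatrix}$$
By the preceding example, the transition matrix $P$ from $\{e_i\}$ to $\{f_i\}$ and its inverse
$P^{-1}$ are given by
$$P = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad P^{-1} = \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix}$$
We verify the result of Theorem 7.4:
$$P[v]_f = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} b \\ b - a \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix} = [v]_e$$
$$P^{-1}[v]_e = \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} b \\ b - a \end{pmatrix} = [v]_f$$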


The next theorem shows how matrix representations of linear operators are affected
by a change of basis.


Theorem 7.5: Let $P$ be the transition matrix from a basis $\{e_i\}$ to a basis $\{f_i\}$ in a vector
space $V$. Then for any linear operator $T$ on $V$, $[T]_f = P^{-1}[T]_eP$.


Example 7.7: Let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(x, y) = (4x - 2y, 2x + y)$. Then for
the bases of $\mathbf{R}^2$ in Example 7.5, we have
$$\begin{aligned} T(e_1) &= T(1, 0) = (4, 2) = 4(1, 0) + 2(0, 1) = 4e_1 + 2e_2 \\ T(e_2) &= T(0, 1) = (-2, 1) = -2(1, 0) + (0, 1) = -2e_1 + e_2 \end{aligned}$$
Accordingly,
$$[T]_e = \begin{pmatrix} 4 & -2 \\ 2 & 1 \end{pmatrix}$$
We compute $[T]_f$ using Theorem 7.5:
$$[T]_f = P^{-1}[T]_eP = \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 4 & -2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 3 & -2 \\ 1 & 2 \end{pmatrix}$$
Note that this agrees with the derivation of $[T]_f$ in Example 7.2.
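
The change-of-basis computation of Example 7.7 is a short matrix product, and can be reproduced as follows. This is a sketch for the 2 by 2 case only; the helper names are illustrative and exact arithmetic is assumed through `fractions.Fraction`.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


P = [[1, -1], [1, 0]]        # transition matrix from {e_i} to {f_i}
T_e = [[4, -2], [2, 1]]      # matrix of T in the usual basis {e_i}

# [T]_f = P^{-1} [T]_e P
T_f = mat_mul(mat_mul(inverse_2x2(P), T_e), P)
print(T_f)
```

The result coincides with the matrix obtained in Example 7.2 by expanding $T(f_1)$ and $T(f_2)$ directly in the basis $\{f_1, f_2\}$.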


Remark: Suppose $P = (a_{ij})$ is any $n$-square invertible matrix over a field $K$. Now if
$\{e_1, \ldots, e_n\}$ is a basis of a vector space $V$ over $K$, then the $n$ vectors
$$f_i = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n, \qquad i = 1, \ldots, n$$
are linearly independent (Problem 5.47) and so form another basis of $V$.
Furthermore, $P$ is the transition matrix from the basis $\{e_i\}$ to the basis $\{f_i\}$.
Accordingly, if $A$ is any matrix representation of a linear operator $T$ on $V$
then the matrix $B = P^{-1}AP$ is also a matrix representation of $T$.


SIMILARITY 


Suppose $A$ and $B$ are square matrices for which there exists an invertible matrix $P$
such that $B = P^{-1}AP$. Then $B$ is said to be similar to $A$ or is said to be obtained from $A$
by a similarity transformation. We show (Problem 7.22) that similarity of matrices is an
equivalence relation. Thus by Theorem 7.5 and the above remark, we have the following
basic result.


Theorem 7.6: Two matrices A and B represent the same linear operator T if and only if 
they are similar to each other. 


That is, all the matrix representations of the linear operator T form an equivalence 
class of similar matrices. 

A linear operator $T$ is said to be diagonalizable if for some basis $\{e_i\}$ it is represented
by a diagonal matrix; the basis $\{e_i\}$ is then said to diagonalize $T$. The preceding theorem
gives us the following result.

Theorem 7.7: Let $A$ be a matrix representation of a linear operator $T$. Then $T$ is
diagonalizable if and only if there exists an invertible matrix $P$ such that
$P^{-1}AP$ is a diagonal matrix.


That is, $T$ is diagonalizable if and only if its matrix representation can be diagonalized
by a similarity transformation.

We emphasize that not every operator is diagonalizable. However, we will show
(Chapter 10) that every operator $T$ can be represented by certain "standard" matrices
called its normal or canonical forms. We comment now that that discussion will require
some theory of fields, polynomials and determinants.

Now suppose $f$ is a function on square matrices which assigns the same value to similar
matrices; that is, $f(A) = f(B)$ whenever $A$ is similar to $B$. Then $f$ induces a function, also
denoted by $f$, on linear operators $T$ in the following natural way: $f(T) = f([T]_e)$, where $\{e_i\}$
is any basis. The function is well-defined by the preceding theorem.

The determinant is perhaps the most important example of the above type of functions.
Another important example follows.

Example 7.8: The trace of a square matrix $A = (a_{ij})$, written $\mathrm{tr}(A)$, is defined to be the sum of
its diagonal elements:
$$\mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}$$
We show (Problem 7.22) that similar matrices have the same trace. Thus we can
speak of the trace of a linear operator $T$; it is the trace of any one of its matrix
representations: $\mathrm{tr}(T) = \mathrm{tr}([T]_e)$.
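
The invariance of the trace under similarity can be illustrated on one concrete pair of matrices. The matrices `A` and `P` below are arbitrary choices made for this sketch, not taken from the text, and a single example is an illustration rather than a proof.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def trace(M):
    """Sum of the diagonal elements of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))


A = [[1, 2], [3, 4]]
P = [[2, 1], [1, 1]]                            # invertible, determinant 1
B = mat_mul(mat_mul(inverse_2x2(P), A), P)      # B = P^{-1} A P is similar to A
print(B, trace(A), trace(B))
```

Although the entries of the two matrices differ, their traces agree, so the trace may be attached to the underlying operator rather than to a particular matrix representation.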




MATRICES AND LINEAR MAPPINGS 


We now consider the general case of linear mappings from one space into another.
Let $V$ and $U$ be vector spaces over the same field $K$ and, say, $\dim V = m$ and $\dim U = n$.
Furthermore, let $\{e_1, \ldots, e_m\}$ and $\{f_1, \ldots, f_n\}$ be arbitrary but fixed bases of $V$ and $U$
respectively.

Suppose $F: V \to U$ is a linear mapping. Then the vectors $F(e_1), \ldots, F(e_m)$ belong to
$U$ and so each is a linear combination of the $f_i$:
$$\begin{aligned} F(e_1) &= a_{11}f_1 + a_{12}f_2 + \cdots + a_{1n}f_n \\ F(e_2) &= a_{21}f_1 + a_{22}f_2 + \cdots + a_{2n}f_n \\ &\;\;\vdots \\ F(e_m) &= a_{m1}f_1 + a_{m2}f_2 + \cdots + a_{mn}f_n \end{aligned}$$

The transpose of the above matrix of coefficients, denoted by $[F]_e^f$, is called the matrix
representation of $F$ relative to the bases $\{e_i\}$ and $\{f_i\}$, or the matrix of $F$ in the bases $\{e_i\}$
and $\{f_i\}$:
$$[F]_e^f = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}$$

The following theorems apply. 


Theorem 7.8: For any vector v EV, [F]/[v]e = [F(v)]p. 


That is, multiplying the coordinate vector of v in the basis {ei} by the matrix [F']!, we 
obtain the coordinate vector of Fv) in the basis {fi}. 


Theorem 7.9: The mapping FP |F']/ is an isomorphism from Hom (V, U) onto the vector 
space of n X m matrices over K. That is, the mapping is one-one and onto 
and, for any F,G <@ Hom(V,U) and any kek, 


[PSGlio = le (Gla ane os er Sac leee 


Remark: Recall that any $n \times m$ matrix $A$ over $K$ has been identified with the linear mapping from $K^m$ into $K^n$ given by $v \mapsto Av$. Now suppose $V$ and $U$ are vector spaces over $K$ of dimensions $m$ and $n$ respectively, and suppose $\{e_i\}$ is a basis of $V$ and $\{f_i\}$ is a basis of $U$. Then in view of the preceding theorem, we shall also identify $A$ with the linear mapping $F: V \to U$ given by $[F(v)]_f = A[v]_e$. We comment that if other bases of $V$ and $U$ are given, then $A$ is identified with another linear mapping from $V$ into $U$.


Theorem 7.10: Let $\{e_i\}$, $\{f_i\}$ and $\{g_i\}$ be bases of $V$, $U$ and $W$ respectively. Let $F: V \to U$ and $G: U \to W$ be linear mappings. Then

\[ [G \circ F]_e^g = [G]_f^g\,[F]_e^f \]


That is, relative to the appropriate bases, the matrix representation of the composition 
of two linear mappings is equal to the product of the matrix representations of the 
individual mappings. 


We lastly show how the matrix representation of a linear mapping F:V-—>U is affected 
when new bases are selected. 


Theorem 7.11: Let $P$ be the transition matrix from a basis $\{e_i\}$ to a basis $\{e_i'\}$ in $V$, and let $Q$ be the transition matrix from a basis $\{f_i\}$ to a basis $\{f_i'\}$ in $U$. Then for any linear mapping $F: V \to U$,

\[ [F]_{e'}^{f'} = Q^{-1}\,[F]_e^f\,P \]

Thus in particular,

\[ [F]_e^{f'} = Q^{-1}\,[F]_e^f \]

i.e. when the change of basis only takes place in $U$; and

\[ [F]_{e'}^f = [F]_e^f\,P \]

i.e. when the change of basis only takes place in $V$.


Note that Theorems 7.1, 7.2, 7.3 and 7.5 are special cases of Theorems 7.8, 7.9, 7.10 
and 7.11 respectively. 


The next theorem shows that every linear mapping from one space into another can be 
represented by a very simple matrix. 


Theorem 7.12: Let $F: V \to U$ be linear and, say, $\text{rank}\,F = r$. Then there exist bases of $V$ and of $U$ such that the matrix representation of $F$ has the form

\[ A = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} \]

where $I$ is the $r$-square identity matrix. We call $A$ the normal or canonical form of $F$.


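WARNING
As noted previously, some texts write the operator symbol $T$ to the right of the vector $v$ on which it acts, that is,

\[ vT \quad\text{instead of}\quad T(v) \]

In such texts, vectors and operators are represented by $n$-tuples and matrices which are the transposes of those appearing here. That is, if

\[ v = k_1 e_1 + k_2 e_2 + \cdots + k_n e_n \]

then they write

\[ [v]_e = (k_1, k_2, \ldots, k_n) \quad\text{instead of}\quad [v]_e = \begin{pmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{pmatrix} \]

And if

\[
\begin{aligned}
T(e_1) &= a_1 e_1 + a_2 e_2 + \cdots + a_n e_n \\
T(e_2) &= b_1 e_1 + b_2 e_2 + \cdots + b_n e_n \\
&\;\;\vdots \\
T(e_n) &= c_1 e_1 + c_2 e_2 + \cdots + c_n e_n
\end{aligned}
\]

then they write

\[
[T]_e = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ b_1 & b_2 & \cdots & b_n \\ \vdots & \vdots & & \vdots \\ c_1 & c_2 & \cdots & c_n \end{pmatrix}
\quad\text{instead of}\quad
[T]_e = \begin{pmatrix} a_1 & b_1 & \cdots & c_1 \\ a_2 & b_2 & \cdots & c_2 \\ \vdots & \vdots & & \vdots \\ a_n & b_n & \cdots & c_n \end{pmatrix}
\]

This is also true for the transition matrix from one basis to another and for matrix representations of linear mappings $F: V \to U$. We comment that such texts have theorems which are analogous to the ones appearing here.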




Solved Problems 


MATRIX REPRESENTATIONS OF LINEAR OPERATORS 
7.1. Find the matrix representation of each of the following operators $T$ on $\mathbf{R}^2$ relative to the usual basis $\{e_1 = (1,0),\ e_2 = (0,1)\}$:
(i) $T(x, y) = (2y,\ 3x - y)$, (ii) $T(x, y) = (3x - 4y,\ x + 5y)$.

Note first that if $(a, b) \in \mathbf{R}^2$, then $(a, b) = a e_1 + b e_2$.

(i) $T(e_1) = T(1,0) = (0, 3) = 0e_1 + 3e_2$ and $T(e_2) = T(0,1) = (2, -1) = 2e_1 - e_2$, and so
\[ [T]_e = \begin{pmatrix} 0 & 2 \\ 3 & -1 \end{pmatrix} \]

(ii) $T(e_1) = T(1,0) = (3, 1) = 3e_1 + e_2$ and $T(e_2) = T(0,1) = (-4, 5) = -4e_1 + 5e_2$, and so
\[ [T]_e = \begin{pmatrix} 3 & -4 \\ 1 & 5 \end{pmatrix} \]


7.2. Find the matrix representation of each operator $T$ in the preceding problem relative to the basis $\{f_1 = (1,3),\ f_2 = (2,5)\}$.

We must first find the coordinates of an arbitrary vector $(a, b) \in \mathbf{R}^2$ with respect to the basis $\{f_i\}$. We have

\[ (a, b) = x(1, 3) + y(2, 5) = (x + 2y,\ 3x + 5y) \]

or $x + 2y = a$ and $3x + 5y = b$,
or $x = 2b - 5a$ and $y = 3a - b$.
Thus $(a, b) = (2b - 5a) f_1 + (3a - b) f_2$.

(i) We have $T(x, y) = (2y,\ 3x - y)$. Hence
\[
\begin{aligned}
T(f_1) &= T(1,3) = (6, 0) = -30 f_1 + 18 f_2 \\
T(f_2) &= T(2,5) = (10, 1) = -48 f_1 + 29 f_2
\end{aligned}
\quad\text{and}\quad
[T]_f = \begin{pmatrix} -30 & -48 \\ 18 & 29 \end{pmatrix}
\]

(ii) We have $T(x, y) = (3x - 4y,\ x + 5y)$. Hence
\[
\begin{aligned}
T(f_1) &= T(1,3) = (-9, 16) = 77 f_1 - 43 f_2 \\
T(f_2) &= T(2,5) = (-14, 27) = 124 f_1 - 69 f_2
\end{aligned}
\quad\text{and}\quad
[T]_f = \begin{pmatrix} 77 & 124 \\ -43 & -69 \end{pmatrix}
\]
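
The procedure of Problem 7.2 (apply $T$ to each basis vector, then take coordinates in the same basis as columns) can be sketched in code. This is only an illustrative check, not part of the text: the helper names `solve_2x2` and `matrix_in_basis` are hypothetical, and exact arithmetic with `fractions.Fraction` is an assumption made to avoid rounding.

```python
from fractions import Fraction

def solve_2x2(f1, f2, target):
    """Return (x, y) with x*f1 + y*f2 == target, using Cramer's rule."""
    det = f1[0] * f2[1] - f2[0] * f1[1]
    if det == 0:
        raise ValueError("f1 and f2 do not form a basis")
    x = Fraction(target[0] * f2[1] - f2[0] * target[1], det)
    y = Fraction(f1[0] * target[1] - target[0] * f1[1], det)
    return x, y

def matrix_in_basis(T, f1, f2):
    """Matrix of the operator T in the basis {f1, f2}.

    Column j holds the coordinates of T(f_j) relative to {f1, f2}.
    """
    c1 = solve_2x2(f1, f2, T(f1))
    c2 = solve_2x2(f1, f2, T(f2))
    return [[c1[0], c2[0]], [c1[1], c2[1]]]

# The two operators of Problem 7.1 and the basis of Problem 7.2.
T1 = lambda v: (2 * v[1], 3 * v[0] - v[1])
T2 = lambda v: (3 * v[0] - 4 * v[1], v[0] + 5 * v[1])
f1, f2 = (1, 3), (2, 5)

M1 = matrix_in_basis(T1, f1, f2)
M2 = matrix_in_basis(T2, f1, f2)
```

The coordinates are stored as columns, matching the convention of this chapter (the transpose of the coefficient array).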


7.3. Suppose that $T$ is the linear operator on $\mathbf{R}^3$ defined by
\[ T(x, y, z) = (a_1 x + a_2 y + a_3 z,\ b_1 x + b_2 y + b_3 z,\ c_1 x + c_2 y + c_3 z) \]

Show that the matrix of $T$ in the usual basis $\{e_i\}$ is given by

\[ [T]_e = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \]

That is, the rows of $[T]_e$ are obtained from the coefficients of $x$, $y$ and $z$ in the components of $T(x, y, z)$.

\[
\begin{aligned}
T(e_1) &= T(1, 0, 0) = (a_1, b_1, c_1) = a_1 e_1 + b_1 e_2 + c_1 e_3 \\
T(e_2) &= T(0, 1, 0) = (a_2, b_2, c_2) = a_2 e_1 + b_2 e_2 + c_2 e_3 \\
T(e_3) &= T(0, 0, 1) = (a_3, b_3, c_3) = a_3 e_1 + b_3 e_2 + c_3 e_3
\end{aligned}
\]

Accordingly,
\[ [T]_e = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \]

Remark: This property holds for any space $K^n$ but only relative to the usual basis
$\{e_1 = (1, 0, \ldots, 0),\ e_2 = (0, 1, 0, \ldots, 0),\ \ldots,\ e_n = (0, \ldots, 0, 1)\}$.




7.4. Find the matrix representation of each of the following linear operators $T$ on $\mathbf{R}^3$ relative to the usual basis $\{e_1 = (1,0,0),\ e_2 = (0,1,0),\ e_3 = (0,0,1)\}$:
(i) $T(x, y, z) = (2x - 3y + 4z,\ 5x - y + 2z,\ 4x + 7y)$,
(ii) $T(x, y, z) = (2y + z,\ x - 4y,\ 3x)$.

By Problem 7.3:
\[
\text{(i)}\ [T]_e = \begin{pmatrix} 2 & -3 & 4 \\ 5 & -1 & 2 \\ 4 & 7 & 0 \end{pmatrix},
\qquad
\text{(ii)}\ [T]_e = \begin{pmatrix} 0 & 2 & 1 \\ 1 & -4 & 0 \\ 3 & 0 & 0 \end{pmatrix}
\]


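7.5. Let $T$ be the linear operator on $\mathbf{R}^3$ defined by $T(x, y, z) = (2y + z,\ x - 4y,\ 3x)$.
(i) Find the matrix of $T$ in the basis $\{f_1 = (1,1,1),\ f_2 = (1,1,0),\ f_3 = (1,0,0)\}$.
(ii) Verify that $[T]_f [v]_f = [T(v)]_f$ for any vector $v \in \mathbf{R}^3$.

We must first find the coordinates of an arbitrary vector $(a, b, c) \in \mathbf{R}^3$ with respect to the basis $\{f_1, f_2, f_3\}$. Write $(a, b, c)$ as a linear combination of the $f_i$ using unknown scalars $x$, $y$ and $z$:

\[ (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z,\ x + y,\ x) \]

Set corresponding components equal to each other to obtain the system of equations
\[ x + y + z = a, \quad x + y = b, \quad x = c \]
Solve the system for $x$, $y$ and $z$ in terms of $a$, $b$ and $c$ to find $x = c$, $y = b - c$, $z = a - b$. Thus

\[ (a, b, c) = c f_1 + (b - c) f_2 + (a - b) f_3 \]

(i) Since $T(x, y, z) = (2y + z,\ x - 4y,\ 3x)$,
\[
\begin{aligned}
T(f_1) &= T(1,1,1) = (3, -3, 3) = 3f_1 - 6f_2 + 6f_3 \\
T(f_2) &= T(1,1,0) = (2, -3, 3) = 3f_1 - 6f_2 + 5f_3 \\
T(f_3) &= T(1,0,0) = (0, 1, 3) = 3f_1 - 2f_2 - f_3
\end{aligned}
\quad\text{and}\quad
[T]_f = \begin{pmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{pmatrix}
\]

(ii) Suppose $v = (a, b, c)$; then
\[ v = (a, b, c) = c f_1 + (b - c) f_2 + (a - b) f_3 \quad\text{and so}\quad [v]_f = \begin{pmatrix} c \\ b - c \\ a - b \end{pmatrix} \]

Also,
\[ T(v) = T(a, b, c) = (2b + c,\ a - 4b,\ 3a) = 3a f_1 + (-2a - 4b) f_2 + (-a + 6b + c) f_3 \]
and so
\[ [T(v)]_f = \begin{pmatrix} 3a \\ -2a - 4b \\ -a + 6b + c \end{pmatrix} \]
Thus
\[
[T]_f [v]_f = \begin{pmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{pmatrix} \begin{pmatrix} c \\ b - c \\ a - b \end{pmatrix} = \begin{pmatrix} 3a \\ -2a - 4b \\ -a + 6b + c \end{pmatrix} = [T(v)]_f
\]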


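7.6. Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(v) = Av$ (where $v$ is written as a column vector). Find the matrix of $T$ in each of the following bases:
(i) $\{e_1 = (1,0),\ e_2 = (0,1)\}$, i.e. the usual basis; (ii) $\{f_1 = (1,3),\ f_2 = (2,5)\}$.

(i)
\[
\begin{aligned}
T(e_1) &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix} = 1e_1 + 3e_2 \\
T(e_2) &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2e_1 + 4e_2
\end{aligned}
\quad\text{and thus}\quad
[T]_e = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\]

Observe that the matrix of $T$ in the usual basis is precisely the original matrix $A$ which defined $T$. This is not unusual. In fact, we show in the next problem that this is true for any matrix $A$ when using the usual basis.

(ii) By Problem 7.2, $(a, b) = (2b - 5a) f_1 + (3a - b) f_2$. Hence
\[
\begin{aligned}
T(f_1) &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 7 \\ 15 \end{pmatrix} = -5f_1 + 6f_2 \\
T(f_2) &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 2 \\ 5 \end{pmatrix} = \begin{pmatrix} 12 \\ 26 \end{pmatrix} = -8f_1 + 10f_2
\end{aligned}
\quad\text{and thus}\quad
[T]_f = \begin{pmatrix} -5 & -8 \\ 6 & 10 \end{pmatrix}
\]


7.7. Recall that any $n$-square matrix $A = (a_{ij})$ may be viewed as the linear operator $T$ on $K^n$ defined by $T(v) = Av$, where $v$ is written as a column vector. Show that the matrix representation of $T$ relative to the usual basis $\{e_i\}$ of $K^n$ is the matrix $A$, that is, $[T]_e = A$.

\[
\begin{aligned}
T(e_1) &= Ae_1 = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{pmatrix} = a_{11} e_1 + a_{21} e_2 + \cdots + a_{n1} e_n \\
T(e_2) &= Ae_2 = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{pmatrix} = a_{12} e_1 + a_{22} e_2 + \cdots + a_{n2} e_n \\
&\;\;\vdots \\
T(e_n) &= Ae_n = \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{nn} \end{pmatrix} = a_{1n} e_1 + a_{2n} e_2 + \cdots + a_{nn} e_n
\end{aligned}
\]

(That is, $T(e_i) = Ae_i$ is the $i$th column of $A$.) Accordingly,
\[
[T]_e = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = A
\]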

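7.8. Each of the sets (i) $\{1, t, e^t, te^t\}$ and (ii) $\{e^{3t}, te^{3t}, t^2 e^{3t}\}$ is a basis of a vector space $V$ of functions $f: \mathbf{R} \to \mathbf{R}$. Let $D$ be the differential operator on $V$, that is, $D(f) = df/dt$. Find the matrix of $D$ in the given basis.

(i)
\[
\begin{aligned}
D(1) &= 0 = 0(1) + 0(t) + 0(e^t) + 0(te^t) \\
D(t) &= 1 = 1(1) + 0(t) + 0(e^t) + 0(te^t) \\
D(e^t) &= e^t = 0(1) + 0(t) + 1(e^t) + 0(te^t) \\
D(te^t) &= e^t + te^t = 0(1) + 0(t) + 1(e^t) + 1(te^t)
\end{aligned}
\quad\text{and}\quad
[D] = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\]

(ii)
\[
\begin{aligned}
D(e^{3t}) &= 3e^{3t} = 3(e^{3t}) + 0(te^{3t}) + 0(t^2 e^{3t}) \\
D(te^{3t}) &= e^{3t} + 3te^{3t} = 1(e^{3t}) + 3(te^{3t}) + 0(t^2 e^{3t}) \\
D(t^2 e^{3t}) &= 2te^{3t} + 3t^2 e^{3t} = 0(e^{3t}) + 2(te^{3t}) + 3(t^2 e^{3t})
\end{aligned}
\quad\text{and}\quad
[D] = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 2 \\ 0 & 0 & 3 \end{pmatrix}
\]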




7.9. Prove Theorem 7.1: Suppose $\{e_1, \ldots, e_n\}$ is a basis of $V$ and $T$ is a linear operator on $V$. Then for any $v \in V$, $[T]_e [v]_e = [T(v)]_e$.

Suppose, for $i = 1, \ldots, n$,
\[ T(e_i) = a_{i1} e_1 + a_{i2} e_2 + \cdots + a_{in} e_n = \sum_{j=1}^{n} a_{ij} e_j \]
Then $[T]_e$ is the $n$-square matrix whose $j$th row is
\[ (a_{1j}, a_{2j}, \ldots, a_{nj}) \tag{1} \]
Now suppose
\[ v = k_1 e_1 + k_2 e_2 + \cdots + k_n e_n = \sum_{i=1}^{n} k_i e_i \]
Writing a column vector as the transpose of a row vector,
\[ [v]_e = (k_1, k_2, \ldots, k_n)^t \tag{2} \]
Furthermore, using the linearity of $T$,
\[
T(v) = T\Big(\sum_{i=1}^{n} k_i e_i\Big) = \sum_{i=1}^{n} k_i T(e_i) = \sum_{i=1}^{n} k_i \Big(\sum_{j=1}^{n} a_{ij} e_j\Big) = \sum_{j=1}^{n} (a_{1j} k_1 + a_{2j} k_2 + \cdots + a_{nj} k_n)\, e_j
\]
Thus $[T(v)]_e$ is the column vector whose $j$th entry is
\[ a_{1j} k_1 + a_{2j} k_2 + \cdots + a_{nj} k_n \tag{3} \]
On the other hand, the $j$th entry of $[T]_e [v]_e$ is obtained by multiplying the $j$th row of $[T]_e$ by $[v]_e$, i.e. (1) by (2). But the product of (1) and (2) is (3); hence $[T]_e [v]_e$ and $[T(v)]_e$ have the same entries. Thus $[T]_e [v]_e = [T(v)]_e$.


7.10. Prove Theorem 7.2: Let $\{e_1, \ldots, e_n\}$ be a basis of $V$ over $K$, and let $\mathcal{A}$ be the algebra of $n$-square matrices over $K$. Then the mapping $T \mapsto [T]_e$ is a vector space isomorphism from $A(V)$ onto $\mathcal{A}$. That is, the mapping is one-one and onto and, for any $S, T \in A(V)$ and any $k \in K$, $[T + S]_e = [T]_e + [S]_e$ and $[kT]_e = k[T]_e$.

The mapping is one-one since, by Theorem 8.1, a linear mapping is completely determined by its values on a basis. The mapping is onto since each matrix $M \in \mathcal{A}$ is the image of the linear operator
\[ T(e_i) = \sum_{j=1}^{n} m_{ij} e_j, \quad i = 1, \ldots, n \]
where $(m_{ij})$ is the transpose of the matrix $M$.

Now suppose, for $i = 1, \ldots, n$,
\[ T(e_i) = \sum_{j=1}^{n} a_{ij} e_j \quad\text{and}\quad S(e_i) = \sum_{j=1}^{n} b_{ij} e_j \]
Let $A$ and $B$ be the matrices $A = (a_{ij})$ and $B = (b_{ij})$. Then $[T]_e = A^t$ and $[S]_e = B^t$. We have, for $i = 1, \ldots, n$,
\[ (T + S)(e_i) = T(e_i) + S(e_i) = \sum_{j=1}^{n} (a_{ij} + b_{ij})\, e_j \]
Observe that $A + B$ is the matrix $(a_{ij} + b_{ij})$. Accordingly,
\[ [T + S]_e = (A + B)^t = A^t + B^t = [T]_e + [S]_e \]
We also have, for $i = 1, \ldots, n$,
\[ (kT)(e_i) = k\,T(e_i) = k \sum_{j=1}^{n} a_{ij} e_j = \sum_{j=1}^{n} (k a_{ij})\, e_j \]
Observe that $kA$ is the matrix $(k a_{ij})$. Accordingly,
\[ [kT]_e = (kA)^t = kA^t = k[T]_e \]

Thus the theorem is proved.




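7.11. Prove Theorem 7.3: Let $\{e_1, \ldots, e_n\}$ be a basis of $V$. Then for any linear operators $S, T \in A(V)$, $[ST]_e = [S]_e [T]_e$.

Suppose $T(e_i) = \sum_{j=1}^{n} a_{ij} e_j$ and $S(e_j) = \sum_{k=1}^{n} b_{jk} e_k$. Let $A$ and $B$ be the matrices $A = (a_{ij})$ and $B = (b_{jk})$. Then $[T]_e = A^t$ and $[S]_e = B^t$. We have

\[
(ST)(e_i) = S(T(e_i)) = S\Big(\sum_{j=1}^{n} a_{ij} e_j\Big) = \sum_{j=1}^{n} a_{ij} S(e_j)
= \sum_{j=1}^{n} a_{ij} \Big(\sum_{k=1}^{n} b_{jk} e_k\Big) = \sum_{k=1}^{n} \Big(\sum_{j=1}^{n} a_{ij} b_{jk}\Big) e_k
\]

Recall that $AB$ is the matrix $AB = (c_{ik})$ where $c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}$. Accordingly,

\[ [ST]_e = (AB)^t = B^t A^t = [S]_e [T]_e \]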


CHANGE OF BASIS, SIMILAR MATRICES 


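7.12. Consider these bases of $\mathbf{R}^2$: $\{e_1 = (1,0),\ e_2 = (0,1)\}$ and $\{f_1 = (1,3),\ f_2 = (2,5)\}$. (i) Find the transition matrix $P$ from $\{e_i\}$ to $\{f_i\}$. (ii) Find the transition matrix $Q$ from $\{f_i\}$ to $\{e_i\}$. (iii) Verify that $Q = P^{-1}$. (iv) Show that $[v]_f = P^{-1}[v]_e$ for any vector $v \in \mathbf{R}^2$. (v) Show that $[T]_f = P^{-1}[T]_e P$ for the operator $T$ on $\mathbf{R}^2$ defined by $T(x, y) = (2y,\ 3x - y)$. (See Problems 7.1 and 7.2.)

(i) $f_1 = (1,3) = 1e_1 + 3e_2$ and $f_2 = (2,5) = 2e_1 + 5e_2$, and so
\[ P = \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix} \]

(ii) By Problem 7.2, $(a, b) = (2b - 5a) f_1 + (3a - b) f_2$. Thus $e_1 = (1,0) = -5f_1 + 3f_2$ and $e_2 = (0,1) = 2f_1 - f_2$, and so
\[ Q = \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix} \]

(iii)
\[ PQ = \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I \]

(iv) If $v = (a, b)$, then $[v]_e = \begin{pmatrix} a \\ b \end{pmatrix}$ and $[v]_f = \begin{pmatrix} 2b - 5a \\ 3a - b \end{pmatrix}$. Hence
\[ P^{-1}[v]_e = \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} -5a + 2b \\ 3a - b \end{pmatrix} = [v]_f \]

(v) By Problems 7.1 and 7.2, $[T]_e = \begin{pmatrix} 0 & 2 \\ 3 & -1 \end{pmatrix}$ and $[T]_f = \begin{pmatrix} -30 & -48 \\ 18 & 29 \end{pmatrix}$. Hence
\[ P^{-1}[T]_e P = \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 0 & 2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix} = \begin{pmatrix} -30 & -48 \\ 18 & 29 \end{pmatrix} = [T]_f \]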
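
The check in part (v) of Problem 7.12 can be reproduced mechanically. The sketch below is illustrative only: `matmul` and `inverse_2x2` are hypothetical helper names, and `fractions.Fraction` is used as an assumption so that the inverse stays exact.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

P = [[1, 2], [3, 5]]       # transition matrix from {e_i} to {f_i}
T_e = [[0, 2], [3, -1]]    # matrix of T(x, y) = (2y, 3x - y) in the usual basis

Q = inverse_2x2(P)                   # should equal the matrix Q of part (ii)
T_f = matmul(matmul(Q, T_e), P)      # P^{-1} [T]_e P
```

Comparing `T_f` with the matrix found directly in Problem 7.2 gives an independent confirmation of the change-of-basis formula for this one example; it is not a proof.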


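7.13. Consider the following bases of $\mathbf{R}^3$: $\{e_1 = (1,0,0),\ e_2 = (0,1,0),\ e_3 = (0,0,1)\}$ and $\{f_1 = (1,1,1),\ f_2 = (1,1,0),\ f_3 = (1,0,0)\}$. (i) Find the transition matrix $P$ from $\{e_i\}$ to $\{f_i\}$. (ii) Find the transition matrix $Q$ from $\{f_i\}$ to $\{e_i\}$. (iii) Verify that $Q = P^{-1}$. (iv) Show that $[v]_f = P^{-1}[v]_e$ for any vector $v \in \mathbf{R}^3$. (v) Show that $[T]_f = P^{-1}[T]_e P$ for the $T$ defined by $T(x, y, z) = (2y + z,\ x - 4y,\ 3x)$. (See Problems 7.4 and 7.5.)

(i)
\[
\begin{aligned}
f_1 &= (1,1,1) = 1e_1 + 1e_2 + 1e_3 \\
f_2 &= (1,1,0) = 1e_1 + 1e_2 + 0e_3 \\
f_3 &= (1,0,0) = 1e_1 + 0e_2 + 0e_3
\end{aligned}
\quad\text{and}\quad
P = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
\]

(ii) By Problem 7.5, $(a, b, c) = c f_1 + (b - c) f_2 + (a - b) f_3$. Thus
\[
\begin{aligned}
e_1 &= (1,0,0) = 0f_1 + 0f_2 + 1f_3 \\
e_2 &= (0,1,0) = 0f_1 + 1f_2 - 1f_3 \\
e_3 &= (0,0,1) = 1f_1 - 1f_2 + 0f_3
\end{aligned}
\quad\text{and}\quad
Q = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & -1 \\ 1 & -1 & 0 \end{pmatrix}
\]

(iii)
\[ PQ = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & -1 \\ 1 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I \]

(iv) If $v = (a, b, c)$, then $[v]_e = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$ and $[v]_f = \begin{pmatrix} c \\ b - c \\ a - b \end{pmatrix}$. Thus
\[ P^{-1}[v]_e = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & -1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} c \\ b - c \\ a - b \end{pmatrix} = [v]_f \]

(v) By Problems 7.4(ii) and 7.5, $[T]_e = \begin{pmatrix} 0 & 2 & 1 \\ 1 & -4 & 0 \\ 3 & 0 & 0 \end{pmatrix}$ and $[T]_f = \begin{pmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{pmatrix}$. Thus
\[
P^{-1}[T]_e P = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & -1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 2 & 1 \\ 1 & -4 & 0 \\ 3 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{pmatrix} = [T]_f
\]


7.14. Prove Theorem 7.4: Let $P$ be the transition matrix from a basis $\{e_i\}$ to a basis $\{f_i\}$ in a vector space $V$. Then for any $v \in V$, $P[v]_f = [v]_e$. Also, $[v]_f = P^{-1}[v]_e$.

Suppose, for $i = 1, \ldots, n$, $f_i = a_{i1} e_1 + a_{i2} e_2 + \cdots + a_{in} e_n = \sum_{j=1}^{n} a_{ij} e_j$. Then $P$ is the $n$-square matrix whose $j$th row is
\[ (a_{1j}, a_{2j}, \ldots, a_{nj}) \tag{1} \]
Also suppose $v = k_1 f_1 + k_2 f_2 + \cdots + k_n f_n = \sum_{i=1}^{n} k_i f_i$. Then writing a column vector as the transpose of a row vector,
\[ [v]_f = (k_1, k_2, \ldots, k_n)^t \tag{2} \]
Substituting for $f_i$ in the equation for $v$,
\[
v = \sum_{i=1}^{n} k_i f_i = \sum_{i=1}^{n} k_i \Big(\sum_{j=1}^{n} a_{ij} e_j\Big) = \sum_{j=1}^{n} \Big(\sum_{i=1}^{n} a_{ij} k_i\Big) e_j
= \sum_{j=1}^{n} (a_{1j} k_1 + a_{2j} k_2 + \cdots + a_{nj} k_n)\, e_j
\]
Accordingly, $[v]_e$ is the column vector whose $j$th entry is
\[ a_{1j} k_1 + a_{2j} k_2 + \cdots + a_{nj} k_n \tag{3} \]

On the other hand, the $j$th entry of $P[v]_f$ is obtained by multiplying the $j$th row of $P$ by $[v]_f$, i.e. (1) by (2). But the product of (1) and (2) is (3); hence $P[v]_f$ and $[v]_e$ have the same entries and thus $P[v]_f = [v]_e$.

Furthermore, multiplying the above by $P^{-1}$ gives $P^{-1}[v]_e = P^{-1}P[v]_f = [v]_f$.


7.15. Prove Theorem 7.5: Let $P$ be the transition matrix from a basis $\{e_i\}$ to a basis $\{f_i\}$ in a vector space $V$. Then, for any linear operator $T$ on $V$, $[T]_f = P^{-1}[T]_e P$.

For any vector $v \in V$, $P^{-1}[T]_e P[v]_f = P^{-1}[T]_e [v]_e = P^{-1}[T(v)]_e = [T(v)]_f$.

But $[T]_f [v]_f = [T(v)]_f$; hence $P^{-1}[T]_e P[v]_f = [T]_f [v]_f$.
Since the mapping $v \mapsto [v]_f$ is onto $K^n$, $P^{-1}[T]_e P X = [T]_f X$ for every $X \in K^n$.

Accordingly, $P^{-1}[T]_e P = [T]_f$.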


7.16. Show that similarity of matrices is an equivalence relation, that is: (i) $A$ is similar to $A$; (ii) if $A$ is similar to $B$, then $B$ is similar to $A$; (iii) if $A$ is similar to $B$ and $B$ is similar to $C$ then $A$ is similar to $C$.

(i) The identity matrix $I$ is invertible and $I = I^{-1}$. Since $A = I^{-1}AI$, $A$ is similar to $A$.

(ii) Since $A$ is similar to $B$ there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Hence $B = PAP^{-1} = (P^{-1})^{-1}AP^{-1}$ and $P^{-1}$ is invertible. Thus $B$ is similar to $A$.

(iii) Since $A$ is similar to $B$ there exists an invertible matrix $P$ such that $A = P^{-1}BP$, and since $B$ is similar to $C$ there exists an invertible matrix $Q$ such that $B = Q^{-1}CQ$. Hence $A = P^{-1}BP = P^{-1}(Q^{-1}CQ)P = (QP)^{-1}C(QP)$ and $QP$ is invertible. Thus $A$ is similar to $C$.


TRACE 


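7.17. The trace of a square matrix $A = (a_{ij})$, written $\text{tr}(A)$, is the sum of its diagonal elements: $\text{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}$. Show that (i) $\text{tr}(AB) = \text{tr}(BA)$, (ii) if $A$ is similar to $B$ then $\text{tr}(A) = \text{tr}(B)$.

(i) Suppose $A = (a_{ij})$ and $B = (b_{ij})$. Then $AB = (c_{ik})$ where $c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}$. Thus
\[ \text{tr}(AB) = \sum_{i=1}^{n} c_{ii} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji} \]
On the other hand, $BA = (d_{jk})$ where $d_{jk} = \sum_{i=1}^{n} b_{ji} a_{ik}$. Thus
\[ \text{tr}(BA) = \sum_{j=1}^{n} d_{jj} = \sum_{j=1}^{n} \sum_{i=1}^{n} b_{ji} a_{ij} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji} = \text{tr}(AB) \]

(ii) If $A$ is similar to $B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Using (i),
\[ \text{tr}(A) = \text{tr}(P^{-1}BP) = \text{tr}(BPP^{-1}) = \text{tr}(B) \]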
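
Both parts of Problem 7.17 can be spot-checked numerically. The sketch below is an illustration, not a proof: the sample matrices `A` and `B` are arbitrary choices made here, while `P` and its inverse are taken from Problem 7.12.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

# Arbitrary sample matrices (assumed for illustration only).
A = [[1, 2], [3, 4]]
B = [[0, 5], [-1, 7]]

# P and its inverse, as computed in Problem 7.12.
P = [[1, 2], [3, 5]]
P_inv = [[-5, 2], [3, -1]]

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
tr_similar = trace(matmul(matmul(P_inv, A), P))   # trace of a matrix similar to A
```

Here `AB` and `BA` are different matrices, yet their traces agree, and conjugating `A` by `P` leaves the trace unchanged.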


7.18. Find the trace of the following operator on $\mathbf{R}^3$:
\[ T(x, y, z) = (a_1 x + a_2 y + a_3 z,\ b_1 x + b_2 y + b_3 z,\ c_1 x + c_2 y + c_3 z) \]
We first must find a matrix representation of $T$. Choosing the usual basis $\{e_i\}$,
\[ [T]_e = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \]
and $\text{tr}(T) = \text{tr}([T]_e) = a_1 + b_2 + c_3$.


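7.19. Let $V$ be the space of $2 \times 2$ matrices over $\mathbf{R}$, and let $M = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Let $T$ be the linear operator on $V$ defined by $T(A) = MA$. Find the trace of $T$.

We must first find a matrix representation of $T$. Choose the usual basis of $V$:
\[
E_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\quad
E_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\quad
E_3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},\quad
E_4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\]
Then
\[
\begin{aligned}
T(E_1) &= ME_1 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix} = 1E_1 + 0E_2 + 3E_3 + 0E_4 \\
T(E_2) &= ME_2 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 3 \end{pmatrix} = 0E_1 + 1E_2 + 0E_3 + 3E_4 \\
T(E_3) &= ME_3 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 4 & 0 \end{pmatrix} = 2E_1 + 0E_2 + 4E_3 + 0E_4 \\
T(E_4) &= ME_4 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 0 & 4 \end{pmatrix} = 0E_1 + 2E_2 + 0E_3 + 4E_4
\end{aligned}
\]
Hence
\[
[T]_E = \begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 2 \\ 3 & 0 & 4 & 0 \\ 0 & 3 & 0 & 4 \end{pmatrix}
\]

and $\text{tr}(T) = 1 + 1 + 4 + 4 = 10$.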
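
The construction in Problem 7.19 (apply $T$ to each basis matrix, flatten the image into coordinates, and store it as a column) can be sketched as follows. The function name `left_mult_matrix` is hypothetical and the code is meant only as an illustrative check of the $4 \times 4$ matrix and its trace.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def left_mult_matrix(M):
    """Matrix of T(A) = MA on 2 x 2 matrices, in the usual basis E1..E4."""
    basis = [
        [[1, 0], [0, 0]],  # E1
        [[0, 1], [0, 0]],  # E2
        [[0, 0], [1, 0]],  # E3
        [[0, 0], [0, 1]],  # E4
    ]
    columns = []
    for E in basis:
        image = matmul(M, E)
        # Coordinates of a 2 x 2 matrix in E1..E4 are its entries read row by row.
        columns.append([image[0][0], image[0][1], image[1][0], image[1][1]])
    # The coordinate vectors are the columns of the representation.
    return [[columns[j][i] for j in range(4)] for i in range(4)]

M = [[1, 2], [3, 4]]
T_matrix = left_mult_matrix(M)
tr_T = sum(T_matrix[i][i] for i in range(4))
```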


MATRIX REPRESENTATIONS OF LINEAR MAPPINGS 
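7.20. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be the linear mapping defined by $F(x, y, z) = (3x + 2y - 4z,\ x - 5y + 3z)$.
(i) Find the matrix of $F$ in the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
\[ \{f_1 = (1,1,1),\ f_2 = (1,1,0),\ f_3 = (1,0,0)\}, \qquad \{g_1 = (1,3),\ g_2 = (2,5)\} \]
(ii) Verify that the action of $F$ is preserved by its matrix representation; that is, for any $v \in \mathbf{R}^3$, $[F]_f^g [v]_f = [F(v)]_g$.

(i) By Problem 7.2, $(a, b) = (2b - 5a) g_1 + (3a - b) g_2$. Hence
\[
\begin{aligned}
F(f_1) &= F(1,1,1) = (1, -1) = -7g_1 + 4g_2 \\
F(f_2) &= F(1,1,0) = (5, -4) = -33g_1 + 19g_2 \\
F(f_3) &= F(1,0,0) = (3, 1) = -13g_1 + 8g_2
\end{aligned}
\quad\text{and}\quad
[F]_f^g = \begin{pmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{pmatrix}
\]

(ii) If $v = (x, y, z)$ then, by Problem 7.5, $v = z f_1 + (y - z) f_2 + (x - y) f_3$. Also,
\[ F(v) = (3x + 2y - 4z,\ x - 5y + 3z) = (-13x - 20y + 26z)\, g_1 + (8x + 11y - 15z)\, g_2 \]
Hence
\[ [v]_f = \begin{pmatrix} z \\ y - z \\ x - y \end{pmatrix} \quad\text{and}\quad [F(v)]_g = \begin{pmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{pmatrix} \]
Thus
\[
[F]_f^g [v]_f = \begin{pmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{pmatrix} \begin{pmatrix} z \\ y - z \\ x - y \end{pmatrix} = \begin{pmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{pmatrix} = [F(v)]_g
\]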
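
Problem 7.20 can be replayed numerically as a sanity check. The sketch below hard-codes the coordinate formulas from Problems 7.2 and 7.5; the function names `coords_f`, `coords_g` and `apply` are hypothetical, and the sample vectors are arbitrary.

```python
def F(v):
    """The linear mapping of Problem 7.20."""
    x, y, z = v
    return (3 * x + 2 * y - 4 * z, x - 5 * y + 3 * z)

def coords_g(w):
    """Coordinates of (a, b) in {g1 = (1,3), g2 = (2,5)} (formula of Problem 7.2)."""
    a, b = w
    return (2 * b - 5 * a, 3 * a - b)

def coords_f(v):
    """Coordinates of (x, y, z) in {(1,1,1), (1,1,0), (1,0,0)} (formula of Problem 7.5)."""
    x, y, z = v
    return (z, y - z, x - y)

def apply(M, vec):
    """Multiply a matrix (list of rows) by a coordinate vector."""
    return tuple(sum(row[k] * vec[k] for k in range(len(vec))) for row in M)

f_basis = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]

# Column j of the representation is the g-coordinate vector of F(f_j).
columns = [coords_g(F(f)) for f in f_basis]
F_matrix = [[col[i] for col in columns] for i in range(2)]

# Check [F][v]_f == [F(v)]_g on a few arbitrary sample vectors.
samples = [(1, 2, 3), (0, -4, 7), (5, 5, 5)]
checks = [apply(F_matrix, coords_f(v)) == coords_g(F(v)) for v in samples]
```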


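7.21. Let $F: K^n \to K^m$ be the linear mapping defined by
\[ F(x_1, x_2, \ldots, x_n) = (a_{11} x_1 + \cdots + a_{1n} x_n,\ a_{21} x_1 + \cdots + a_{2n} x_n,\ \ldots,\ a_{m1} x_1 + \cdots + a_{mn} x_n) \]
Show that the matrix representation of $F$ relative to the usual bases of $K^n$ and of $K^m$ is given by
\[
[F] = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}
\]
That is, the rows of $[F]$ are obtained from the coefficients of the $x_i$ in the components of $F(x_1, \ldots, x_n)$, respectively.

\[
\begin{aligned}
F(1, 0, \ldots, 0) &= (a_{11}, a_{21}, \ldots, a_{m1}) \\
F(0, 1, \ldots, 0) &= (a_{12}, a_{22}, \ldots, a_{m2}) \\
&\;\;\vdots \\
F(0, 0, \ldots, 1) &= (a_{1n}, a_{2n}, \ldots, a_{mn})
\end{aligned}
\quad\text{and}\quad
[F] = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}
\]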
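7.22. Find the matrix representation of each of the following linear mappings relative to the usual bases of $\mathbf{R}^n$:
(i) $F: \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x, y) = (8x - y,\ 2x + 4y,\ 5x - 6y)$
(ii) $F: \mathbf{R}^4 \to \mathbf{R}^2$ defined by $F(x, y, s, t) = (8x - 4y + 2s - 5t,\ 5x + 7y - s - 2t)$
(iii) $F: \mathbf{R}^3 \to \mathbf{R}^4$ defined by $F(x, y, z) = (2x + 3y - 8z,\ x + y + z,\ 4x - 5z,\ 6y)$

By Problem 7.21, we need only look at the coefficients of the unknowns in $F(x, y, \ldots)$. Thus
\[
\text{(i)}\ [F] = \begin{pmatrix} 8 & -1 \\ 2 & 4 \\ 5 & -6 \end{pmatrix}
\qquad
\text{(ii)}\ [F] = \begin{pmatrix} 8 & -4 & 2 & -5 \\ 5 & 7 & -1 & -2 \end{pmatrix}
\qquad
\text{(iii)}\ [F] = \begin{pmatrix} 2 & 3 & -8 \\ 1 & 1 & 1 \\ 4 & 0 & -5 \\ 0 & 6 & 0 \end{pmatrix}
\]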

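7.23. Let $T: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $T(x, y) = (2x - 3y,\ x + 4y)$. Find the matrix of $T$ in the bases $\{e_1 = (1,0),\ e_2 = (0,1)\}$ and $\{f_1 = (1,3),\ f_2 = (2,5)\}$ of $\mathbf{R}^2$ respectively. (We can view $T$ as a linear mapping from one space into another, each having its own basis.)

By Problem 7.2, $(a, b) = (2b - 5a) f_1 + (3a - b) f_2$. Then
\[
\begin{aligned}
T(e_1) &= T(1,0) = (2, 1) = -8f_1 + 5f_2 \\
T(e_2) &= T(0,1) = (-3, 4) = 23f_1 - 13f_2
\end{aligned}
\quad\text{and}\quad
[T]_e^f = \begin{pmatrix} -8 & 23 \\ 5 & -13 \end{pmatrix}
\]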
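7.24. Let $A = \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix}$. Recall that $A$ determines a linear mapping $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(v) = Av$ where $v$ is written as a column vector.

(i) Show that the matrix representation of $F$ relative to the usual basis of $\mathbf{R}^3$ and of $\mathbf{R}^2$ is the matrix $A$ itself: $[F] = A$.

(ii) Find the matrix representation of $F$ relative to the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
\[ \{f_1 = (1,1,1),\ f_2 = (1,1,0),\ f_3 = (1,0,0)\}, \qquad \{g_1 = (1,3),\ g_2 = (2,5)\} \]

(i)
\[
\begin{aligned}
F(1,0,0) &= \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} = 2e_1 + 1e_2 \\
F(0,1,0) &= \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 5 \\ -4 \end{pmatrix} = 5e_1 - 4e_2 \\
F(0,0,1) &= \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 7 \end{pmatrix} = -3e_1 + 7e_2
\end{aligned}
\]
from which $[F] = \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} = A$. (Compare with Problem 7.7.)

(ii) By Problem 7.2, $(a, b) = (2b - 5a) g_1 + (3a - b) g_2$. Then
\[
\begin{aligned}
F(f_1) &= \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ 4 \end{pmatrix} = -12g_1 + 8g_2 \\
F(f_2) &= \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 7 \\ -3 \end{pmatrix} = -41g_1 + 24g_2 \\
F(f_3) &= \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} = -8g_1 + 5g_2
\end{aligned}
\quad\text{and}\quad
[F]_f^g = \begin{pmatrix} -12 & -41 & -8 \\ 8 & 24 & 5 \end{pmatrix}
\]


7.25. Prove Theorem 7.12: Let $F: V \to U$ be linear. Then there exists a basis of $V$ and a basis of $U$ such that the matrix representation $A$ of $F$ has the form $A = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}$ where $I$ is the $r$-square identity matrix and $r$ is the rank of $F$.

Suppose $\dim V = m$ and $\dim U = n$. Let $W$ be the kernel of $F$ and $U'$ the image of $F$. We are given that $\text{rank}\,F = r$; hence the dimension of the kernel of $F$ is $m - r$. Let $\{w_1, \ldots, w_{m-r}\}$ be a basis of the kernel of $F$ and extend this to a basis of $V$:
\[ \{v_1, \ldots, v_r, w_1, \ldots, w_{m-r}\} \]
Set
\[ u_1 = F(v_1),\ u_2 = F(v_2),\ \ldots,\ u_r = F(v_r) \]
We note that $\{u_1, \ldots, u_r\}$ is a basis of $U'$, the image of $F$. Extend this to a basis
\[ \{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\} \]
of $U$. Observe that
\[
\begin{aligned}
F(v_1) &= u_1 = 1u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
F(v_2) &= u_2 = 0u_1 + 1u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
&\;\;\vdots \\
F(v_r) &= u_r = 0u_1 + 0u_2 + \cdots + 1u_r + 0u_{r+1} + \cdots + 0u_n \\
F(w_1) &= 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
&\;\;\vdots \\
F(w_{m-r}) &= 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n
\end{aligned}
\]

Thus the matrix of $F$ in the above bases has the required form.


Supplementary Problems


MATRIX REPRESENTATIONS OF LINEAR OPERATORS

7.26. Find the matrix of each of the following linear operators $T$ on $\mathbf{R}^2$ with respect to the usual basis $\{e_1 = (1,0),\ e_2 = (0,1)\}$: (i) $T(x, y) = (2x - 3y,\ x + y)$, (ii) $T(x, y) = (5x + y,\ 3x - 2y)$.

7.27. Find the matrix of each operator $T$ in the preceding problem with respect to the basis $\{f_1 = (1,2),\ f_2 = (2,3)\}$. In each case, verify that $[T]_f [v]_f = [T(v)]_f$ for any $v \in \mathbf{R}^2$.

7.28. Find the matrix of each operator $T$ in Problem 7.26 in the basis $\{g_1 = (1,3),\ g_2 = (1,4)\}$.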


7.29. Find the matrix representation of each of the following linear operators $T$ on $\mathbf{R}^3$ relative to the usual basis:
(i) $T(x, y, z) = (x,\ y,\ 0)$
(ii) $T(x, y, z) = (2x - 7y - 4z,\ 8x + y + 4z,\ 6x - 8y + z)$
(iii) $T(x, y, z) = (z,\ y + z,\ x + y + z)$

7.30. Let $D$ be the differential operator, i.e. $D(f) = df/dt$. Each of the following sets is a basis of a vector space $V$ of functions $f: \mathbf{R} \to \mathbf{R}$. Find the matrix of $D$ in each basis: (i) $\{e^t, e^{2t}, te^{2t}\}$, (ii) $\{\sin t, \cos t\}$, (iii) $\{e^{5t}, te^{5t}, t^2 e^{5t}\}$, (iv) $\{1, t, \sin 3t, \cos 3t\}$.

7.31. Consider the complex field $\mathbf{C}$ as a vector space over the real field $\mathbf{R}$. Let $T$ be the conjugation operator on $\mathbf{C}$, i.e. $T(z) = \bar{z}$. Find the matrix of $T$ in each basis: (i) $\{1, i\}$, (ii) $\{1 + i,\ 1 + 2i\}$.

7.32. Let $V$ be the vector space of $2 \times 2$ matrices over $\mathbf{R}$ and let $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Find the matrix of each of the following linear operators $T$ on $V$ in the usual basis (see Problem 7.19) of $V$: (i) $T(A) = MA$, (ii) $T(A) = AM$, (iii) $T(A) = MA - AM$.

7.33. Let $1_V$ and $0_V$ denote the identity and zero operators, respectively, on a vector space $V$. Show that, for any basis $\{e_i\}$ of $V$, (i) $[1_V]_e = I$, the identity matrix, (ii) $[0_V]_e = 0$, the zero matrix.


CHANGE OF BASIS, SIMILAR MATRICES 


7.34. Consider the following bases of $\mathbf{R}^2$: $\{e_1 = (1,0),\ e_2 = (0,1)\}$ and $\{f_1 = (1,2),\ f_2 = (2,3)\}$.
(i) Find the transition matrices $P$ and $Q$ from $\{e_i\}$ to $\{f_i\}$ and from $\{f_i\}$ to $\{e_i\}$, respectively. Verify $Q = P^{-1}$.
(ii) Show that $[v]_e = P[v]_f$ for any vector $v \in \mathbf{R}^2$.
(iii) Show that $[T]_f = P^{-1}[T]_e P$ for each operator $T$ in Problem 7.26.

7.35. Repeat Problem 7.34 for the bases $\{f_1 = (1,2),\ f_2 = (2,3)\}$ and $\{g_1 = (1,3),\ g_2 = (1,4)\}$.

7.36. Suppose $\{e_1, e_2\}$ is a basis of $V$ and $T: V \to V$ is the linear operator for which $T(e_1) = 3e_1 - 2e_2$ and $T(e_2) = e_1 + 4e_2$. Suppose $\{f_1, f_2\}$ is the basis of $V$ for which $f_1 = e_1 + e_2$ and $f_2 = 2e_1 + 3e_2$. Find the matrix of $T$ in the basis $\{f_1, f_2\}$.

7.37. Consider the bases $B = \{1, i\}$ and $B' = \{1 + i,\ 1 + 2i\}$ of the complex field $\mathbf{C}$ over the real field $\mathbf{R}$. (i) Find the transition matrices $P$ and $Q$ from $B$ to $B'$ and from $B'$ to $B$, respectively. Verify that $Q = P^{-1}$. (ii) Show that $[T]_{B'} = P^{-1}[T]_B P$ for the conjugation operator $T$ in Problem 7.31.

7.38. Suppose $\{e_i\}$, $\{f_i\}$ and $\{g_i\}$ are bases of $V$, and that $P$ and $Q$ are the transition matrices from $\{e_i\}$ to $\{f_i\}$ and from $\{f_i\}$ to $\{g_i\}$, respectively. Show that $PQ$ is the transition matrix from $\{e_i\}$ to $\{g_i\}$.

7.39. Let $A$ be a 2 by 2 matrix such that only $A$ is similar to itself. Show that $A$ has the form $A = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$. Generalize to $n \times n$ matrices.

7.40. Show that all the matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank.


MATRIX REPRESENTATIONS OF LINEAR MAPPINGS 


7.41. Find the matrix representation of the linear mappings relative to the usual bases for $\mathbf{R}^n$:
(i) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (2x - 4y + 9z,\ 5x + 3y - 2z)$
(ii) $F: \mathbf{R}^2 \to \mathbf{R}^4$ defined by $F(x, y) = (8x + 4y,\ 5x - 2y,\ x + 7y,\ 4x)$
(iii) $F: \mathbf{R}^4 \to \mathbf{R}$ defined by $F(x, y, s, t) = 2x + 8y - 7s - t$
(iv) $F: \mathbf{R} \to \mathbf{R}^2$ defined by $F(x) = (8x,\ 5x)$




7.42. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be the linear mapping defined by $F(x, y, z) = (2x + y - z,\ 3x - 2y + 4z)$.
(i) Find the matrix of $F$ in the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
\[ \{f_1 = (1,1,1),\ f_2 = (1,1,0),\ f_3 = (1,0,0)\} \quad\text{and}\quad \{g_1 = (1,3),\ g_2 = (1,4)\} \]
(ii) Verify that, for any vector $v \in \mathbf{R}^3$, $[F]_f^g [v]_f = [F(v)]_g$.


7.43. Let $\{e_i\}$ and $\{f_i\}$ be bases of $V$, and let $1_V$ be the identity mapping on $V$. Show that the matrix of $1_V$ in the bases $\{e_i\}$ and $\{f_i\}$ is the inverse of the transition matrix $P$ from $\{e_i\}$ to $\{f_i\}$; that is, $[1_V]_e^f = P^{-1}$.


7.44. Prove Theorem 7.7, page 155. (Hint. See Problem 7.9, page 161.) 
7.45. Prove Theorem 7.8. (Hint. See Problem 7.10.) 
7.46. Prove Theorem 7.9. (Hint. See Problem 7.11.) 


7.47. Prove Theorem 7.10. (Hint. See Problem 7.15.) 


MISCELLANEOUS PROBLEMS 
7.48. Let $T$ be a linear operator on $V$ and let $W$ be a subspace of $V$ invariant under $T$, that is, $T(W) \subset W$. Suppose $\dim W = m$. Show that $T$ has a matrix representation of the form $\begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ where $A$ is an $m \times m$ submatrix.


7.49. Let $V = U \oplus W$, and let $U$ and $W$ each be invariant under a linear operator $T: V \to V$. Suppose $\dim U = m$ and $\dim W = n$. Show that $T$ has a matrix representation of the form $\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$ where $A$ and $B$ are $m \times m$ and $n \times n$ submatrices, respectively.


7.50. Recall that two linear operators $F$ and $G$ on $V$ are said to be similar if there exists an invertible operator $T$ on $V$ such that $G = T^{-1}FT$.
(i) Show that linear operators F' and G are similar if and only if, for any basis {e;} of V, the 
matrix representations [F], and [G], are similar matrices. 


(ii) Show that if an operator F is diagonalizable, then any similar operator G is also diagonalizable. 


7.51. Two $m \times n$ matrices $A$ and $B$ over $K$ are said to be equivalent if there exists an $m$-square invertible matrix $Q$ and an $n$-square invertible matrix $P$ such that $B = QAP$.

(i) Show that equivalence of matrices is an equivalence relation.

(ii) Show that $A$ and $B$ can be matrix representations of the same linear mapping $F: V \to U$ if and only if $A$ and $B$ are equivalent.

(iii) Show that every matrix $A$ is equivalent to a matrix of the form $\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}$ where $I$ is the $r$-square identity matrix and $r = \text{rank}\,A$.


7.52. Two algebras $A$ and $B$ over a field $K$ are said to be isomorphic (as algebras) if there exists a bijective mapping $f: A \to B$ such that for $u, v \in A$ and $k \in K$, (i) $f(u + v) = f(u) + f(v)$, (ii) $f(ku) = kf(u)$, (iii) $f(uv) = f(u)f(v)$. (That is, $f$ preserves the three operations of an algebra: vector addition, scalar multiplication, and vector multiplication.) The mapping $f$ is then called an isomorphism of $A$ onto $B$. Show that the relation of algebra isomorphism is an equivalence relation.


7.53. Let $\mathcal{A}$ be the algebra of $n$-square matrices over $K$, and let $P$ be an invertible matrix in $\mathcal{A}$. Show that the map $A \mapsto P^{-1}AP$, where $A \in \mathcal{A}$, is an algebra isomorphism of $\mathcal{A}$ onto itself.


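Answers to Supplementary Problems


7.26. (i) $\begin{pmatrix} 2 & -3 \\ 1 & 1 \end{pmatrix}$ (ii) $\begin{pmatrix} 5 & 1 \\ 3 & -2 \end{pmatrix}$

7.27. Here $(a, b) = (2b - 3a) f_1 + (2a - b) f_2$. (i) $\begin{pmatrix} 18 & 25 \\ -11 & -15 \end{pmatrix}$ (ii) $\begin{pmatrix} -23 & -39 \\ 15 & 26 \end{pmatrix}$

7.28. Here $(a, b) = (4a - b) g_1 + (b - 3a) g_2$. (i) $\begin{pmatrix} -32 & -45 \\ 25 & 35 \end{pmatrix}$ (ii) $\begin{pmatrix} 35 & 41 \\ -27 & -32 \end{pmatrix}$

7.29. (i) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ (ii) $\begin{pmatrix} 2 & -7 & -4 \\ 8 & 1 & 4 \\ 6 & -8 & 1 \end{pmatrix}$ (iii) $\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$

7.30. (i) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}$ (ii) $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (iii) $\begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 2 \\ 0 & 0 & 5 \end{pmatrix}$ (iv) $\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -3 \\ 0 & 0 & 3 & 0 \end{pmatrix}$

7.31. (i) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ (ii) $\begin{pmatrix} 3 & 4 \\ -2 & -3 \end{pmatrix}$

7.32. (i) $\begin{pmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ c & 0 & d & 0 \\ 0 & c & 0 & d \end{pmatrix}$ (ii) $\begin{pmatrix} a & c & 0 & 0 \\ b & d & 0 & 0 \\ 0 & 0 & a & c \\ 0 & 0 & b & d \end{pmatrix}$ (iii) $\begin{pmatrix} 0 & -c & b & 0 \\ -b & a - d & 0 & b \\ c & 0 & d - a & -c \\ 0 & c & -b & 0 \end{pmatrix}$

7.34. $P = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$, $Q = \begin{pmatrix} -3 & 2 \\ 2 & -1 \end{pmatrix}$

7.35. $P = \begin{pmatrix} 3 & 5 \\ -1 & -2 \end{pmatrix}$, $Q = \begin{pmatrix} 2 & 5 \\ -1 & -3 \end{pmatrix}$

7.36. $\begin{pmatrix} 8 & 11 \\ -2 & -1 \end{pmatrix}$

7.37. $P = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}$, $Q = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}$

7.41. (i) $\begin{pmatrix} 2 & -4 & 9 \\ 5 & 3 & -2 \end{pmatrix}$ (ii) $\begin{pmatrix} 8 & 4 \\ 5 & -2 \\ 1 & 7 \\ 4 & 0 \end{pmatrix}$ (iii) $(2, 8, -7, -1)$ (iv) $\begin{pmatrix} 8 \\ 5 \end{pmatrix}$

7.42. $\begin{pmatrix} 3 & 11 & 5 \\ -1 & -8 & -3 \end{pmatrix}$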


Chapter 8 


Determinants 


INTRODUCTION 


To every square matrix $A$ over a field $K$ there is assigned a specific scalar called the determinant of $A$; it is usually denoted by

\[ \det(A) \quad\text{or}\quad |A| \]

This determinant function was first discovered in the investigation of systems of linear equations. We shall see in the succeeding chapters that the determinant is an indispensable tool in investigating and obtaining properties of a linear operator.


We comment that the definition of the determinant and most of its properties also apply 
in the case where the entries of a matrix come from a ring (see Appendix B). 


We shall begin the chapter with a discussion of permutations, which is necessary for 
the definition of the determinant. 


PERMUTATIONS 


A one-to-one mapping $\sigma$ of the set $\{1, 2, \ldots, n\}$ onto itself is called a permutation. We denote the permutation $\sigma$ by

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix} \quad\text{or}\quad \sigma = j_1 j_2 \ldots j_n, \quad\text{where } j_i = \sigma(i) \]

Observe that since $\sigma$ is one-to-one and onto, the sequence $j_1 j_2 \ldots j_n$ is simply a rearrangement of the numbers $1, 2, \ldots, n$. We remark that the number of such permutations is $n!$, and that the set of them is usually denoted by $S_n$. We also remark that if $\sigma \in S_n$, then the inverse mapping $\sigma^{-1} \in S_n$; and if $\sigma, \tau \in S_n$, then the composition mapping $\sigma \circ \tau \in S_n$. In particular, the identity mapping

\[ \epsilon = \sigma \circ \sigma^{-1} = \sigma^{-1} \circ \sigma \]

belongs to $S_n$. (In fact, $\epsilon = 12 \ldots n$.)


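Example 8.1: There are $2! = 2 \cdot 1 = 2$ permutations in $S_2$: 12 and 21.

Example 8.2: There are $3! = 3 \cdot 2 \cdot 1 = 6$ permutations in $S_3$: 123, 132, 213, 231, 312, 321.

Consider an arbitrary permutation $\sigma$ in $S_n$: $\sigma = j_1 j_2 \ldots j_n$. We say $\sigma$ is even or odd according as to whether there is an even or odd number of pairs $(i, k)$ for which

\[ i > k \quad\text{but}\quad i \text{ precedes } k \text{ in } \sigma \tag{*} \]

We then define the sign or parity of $\sigma$, written $\text{sgn}\,\sigma$, by

\[ \text{sgn}\,\sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases} \]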


172 DETERMINANTS [CHAP. 8 


Example 8.3: Consider the permutation $\sigma = 35142$ in $S_5$.
3 and 5 precede and are greater than 1; hence (3, 1) and (5, 1) satisfy (*).
3, 5 and 4 precede and are greater than 2; hence (3, 2), (5, 2) and (4, 2) satisfy (*).
5 precedes and is greater than 4; hence (5, 4) satisfies (*).

Since exactly six pairs satisfy (*), $\sigma$ is even and $\text{sgn}\,\sigma = 1$.
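
The counting rule (*) translates directly into code. The following is a minimal sketch, assuming a permutation is given as a tuple of its images $j_1, \ldots, j_n$; the function names `inversions` and `sgn` are chosen here for illustration.

```python
def inversions(perm):
    """List the pairs (i, k) with i > k where i precedes k in perm."""
    return [(perm[a], perm[b])
            for a in range(len(perm))
            for b in range(a + 1, len(perm))
            if perm[a] > perm[b]]

def sgn(perm):
    """Sign of a permutation: 1 if the number of such pairs is even, else -1."""
    return 1 if len(inversions(perm)) % 2 == 0 else -1

sigma = (3, 5, 1, 4, 2)          # the permutation 35142 of Example 8.3
pairs = inversions(sigma)
```

For `sigma` the list `pairs` contains exactly the six pairs named in Example 8.3, so `sgn(sigma)` is 1.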
Example 8.4: The identity permutation $\epsilon = 12 \ldots n$ is even since no pair can satisfy (*).


Example 8.5: In $S_2$, 12 is even, and 21 is odd.
In $S_3$, 123, 231 and 312 are even, and 132, 213 and 321 are odd.


Example 8.6: Let $\tau$ be the permutation which interchanges two numbers $i$ and $j$ and leaves the other numbers fixed:
\[ \tau(i) = j, \quad \tau(j) = i, \quad \tau(k) = k, \quad k \neq i, j \]

We call $\tau$ a transposition. If $i < j$, then
\[ \tau = 12 \ldots (i-1)\, j\, (i+1) \ldots (j-1)\, i\, (j+1) \ldots n \]
There are $2(j - i - 1) + 1$ pairs satisfying (*):
\[ (j, i),\ (j, x),\ (x, i), \quad\text{where } x = i+1, \ldots, j-1 \]

Thus the transposition $\tau$ is odd.
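
The count $2(j - i - 1) + 1$ in Example 8.6 can be confirmed by brute force for small $n$. This is a sketch under the assumption that permutations are stored as tuples of images; `transposition` and `count_inversions` are hypothetical helper names, and $n = 6$ is an arbitrary choice.

```python
def transposition(n, i, j):
    """The permutation of 1..n that swaps i and j and fixes everything else."""
    perm = list(range(1, n + 1))
    perm[i - 1], perm[j - 1] = j, i
    return tuple(perm)

def count_inversions(perm):
    """Number of pairs (a, b) with a > b where a precedes b in perm."""
    return sum(1
               for p in range(len(perm))
               for q in range(p + 1, len(perm))
               if perm[p] > perm[q])

n = 6
# For every i < j, compare the brute-force count with the formula 2(j - i - 1) + 1.
results = {(i, j): count_inversions(transposition(n, i, j))
           for i in range(1, n + 1) for j in range(i + 1, n + 1)}
```

Every value in `results` is odd, which agrees with the conclusion that each transposition is an odd permutation.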


DETERMINANT 
Let $A = (a_{ij})$ be an $n$-square matrix over a field $K$:
\[
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\]

Consider a product of $n$ elements of $A$ such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form
\[ a_{1j_1} a_{2j_2} \cdots a_{nj_n} \]
that is, where the factors come from successive rows and so the first subscripts are in the natural order $1, 2, \ldots, n$. Now since the factors come from different columns, the sequence of second subscripts forms a permutation $\sigma = j_1 j_2 \ldots j_n$ in $S_n$. Conversely, each permutation in $S_n$ determines a product of the above form. Thus the matrix $A$ contains $n!$ such products.


Definition: The determinant of the $n$-square matrix $A = (a_{ij})$, denoted by $\det(A)$ or $|A|$, is the following sum which is summed over all permutations $\sigma = j_1 j_2 \ldots j_n$ in $S_n$:
\[ |A| = \sum_{\sigma} (\text{sgn}\,\sigma)\, a_{1j_1} a_{2j_2} \cdots a_{nj_n} \]
That is,
\[ |A| = \sum_{\sigma \in S_n} (\text{sgn}\,\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} \]

The determinant of the $n$-square matrix $A$ is said to be of order $n$ and is frequently denoted by
\[
\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}
\]




We emphasize that a square array of scalars enclosed by straight lines is not a matrix but 
rather the scalar that the determinant assigns to the matrix formed by the array of scalars. 


Example 8.7: The determinant of a 1 x 1 matrix $A = (a_{11})$ is the scalar $a_{11}$ itself: $|A| = a_{11}$.
(We note that the one permutation in $S_1$ is even.)


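Example 8.8: In $S_2$, the permutation 12 is even and the permutation 21 is odd. Hence
\[ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \]
Thus
\[ \begin{vmatrix} 4 & -5 \\ -1 & -2 \end{vmatrix} = 4(-2) - (-5)(-1) = -13 \quad\text{and}\quad \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \]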
-1 —- c 


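Example 8.9: In $S_3$, the permutations 123, 231 and 312 are even, and the permutations 321, 213 and
132 are odd. Hence
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]

This may be written as:
\[ a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \]
or
\[ a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]

which is a linear combination of three determinants of order two whose coefficients
(with alternating signs) form the first row of the given matrix. Note that each
2 x 2 matrix can be obtained by deleting, in the original matrix, the row and column
containing its coefficient: $a_{11}$ multiplies the determinant left after deleting row 1 and
column 1, $a_{12}$ the determinant left after deleting row 1 and column 2, and $a_{13}$ the
determinant left after deleting row 1 and column 3.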
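Example 8.10:
\[ \text{(i)}\quad \begin{vmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{vmatrix} = 2\begin{vmatrix} 6 & 7 \\ 9 & 1 \end{vmatrix} - 3\begin{vmatrix} 5 & 7 \\ 8 & 1 \end{vmatrix} + 4\begin{vmatrix} 5 & 6 \\ 8 & 9 \end{vmatrix} = 2(6-63) - 3(5-56) + 4(45-48) = 27 \]
\[ \text{(ii)}\quad \begin{vmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{vmatrix} = 2\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} - 3\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} + (-4)\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 2(-20+2) - 3(0-2) - 4(0+4) = -46 \]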
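
The definition of the determinant as a signed sum over all permutations can be evaluated literally for small matrices. The sketch below assumes a matrix is given as a list of rows; the names `sign` and `det_by_definition` are ours and are introduced only for illustration.

```python
from itertools import permutations


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (count of inverted pairs)."""
    inverted = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inverted % 2 else 1


def det_by_definition(matrix):
    """Sum of (sgn s) * a[0][s(0)] * ... * a[n-1][s(n-1)] over all n! permutations s."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):  # column indices, 0-based
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


print(det_by_definition([[2, 3, 4], [5, 6, 7], [8, 9, 1]]))     # 27
print(det_by_definition([[2, 3, -4], [0, -4, 2], [1, -1, 5]]))  # -46
```

Since the loop visits all n! permutations, this is useful only as a check on small cases.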


As n increases, the number of terms in the determinant becomes astronomical. Accordingly,
we use indirect methods to evaluate determinants rather than the definition. In fact
we prove a number of properties about determinants which will permit us to shorten the 
computation considerably. In particular, we show that a determinant of order n is equal 
to a linear combination of determinants of order n — 1 as in case n= 3 above. 


PROPERTIES OF DETERMINANTS 
We now list basic properties of the determinant. 


Theorem 8.1: The determinant of a matrix A and its transpose $A^t$ are equal: $|A| = |A^t|$.




By this theorem, any theorem about the determinant of a matrix A which concerns the 
rows of A will have an analogous theorem concerning the columns of A. 

The next theorem gives certain cases for which the determinant can be obtained 
immediately. 


Theorem 8.2: Let A be a square matrix. 
(i) If A has a row (column) of zeros, then |A| = 0. 
(ii) If A has two identical rows (columns), then |A| = 0. 
(iii) If A is triangular, i.e. A has zeros above or below the diagonal, then 
|A| = product of diagonal elements. Thus in particular, |I| = 1 where
I is the identity matrix. 


The next theorem shows how the determinant of a matrix is affected by the “elementary” 
operations. 


Theorem 8.3: Let B be the matrix obtained from a matrix A by
(i) multiplying a row (column) of A by a scalar k; then |B| = k|A|.
(ii) interchanging two rows (columns) of A; then |B| = -|A|.
(iii) adding a multiple of a row (column) of A to another; then |B| = |A|. 
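
Theorem 8.3 together with Theorem 8.2(iii) suggests a practical way to compute a determinant: reduce the matrix to triangular form, tracking the sign changes caused by row interchanges, and multiply the diagonal entries. The following is a minimal sketch of that idea, assuming integer or rational entries (exact arithmetic via `fractions.Fraction`); the function name `det_by_elimination` is hypothetical.

```python
from fractions import Fraction


def det_by_elimination(matrix):
    """Determinant via row reduction to upper triangular form."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available: the determinant is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # Theorem 8.3(ii): an interchange changes the sign
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Theorem 8.3(iii): adding a multiple of a row leaves |A| unchanged
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
        det *= a[col][col]  # Theorem 8.2(iii): product of diagonal elements
    return det


print(det_by_elimination([[2, 3, -4], [0, -4, 2], [1, -1, 5]]))  # -46
```

Only operations (ii) and (iii) are used, so no compensating scalar factors are needed beyond the sign.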


We now state two of the most important and useful theorems on determinants. 


Theorem 8.4: Let A be any n-square matrix. Then the following are equivalent: 
(i) A is invertible, i.e. A has an inverse $A^{-1}$.
(ii) A is nonsingular, i.e. AX = 0 has only the zero solution, or rank A = n,
or the rows (columns) of A are linearly independent.
(iii) The determinant of A is not zero: $|A| \neq 0$.


Theorem 8.5: The determinant is a multiplicative function. That is, the determinant of 
a product of two matrices A and B is equal to the product of their deter- 
minants: |AB| = |A||B|.


We shall prove the above two theorems using the theory of elementary matrices (see 
page 56) and the following lemma. 


Lemma 8.6: Let E be an elementary matrix. Then, for any matrix A, |EA| = |E||A|.


We comment that one can also prove the preceding two theorems directly without 
resorting to the theory of elementary matrices. 


MINORS AND COFACTORS 


Consider an n-square matrix $A = (a_{ij})$. Let $M_{ij}$ denote the (n-1)-square submatrix of
A obtained by deleting its ith row and jth column. The determinant $|M_{ij}|$ is called the minor
of the element $a_{ij}$ of A, and we define the cofactor of $a_{ij}$, denoted by $A_{ij}$, to be the "signed"
minor:
\[ A_{ij} = (-1)^{i+j} |M_{ij}| \]

Note that the "signs" $(-1)^{i+j}$ accompanying the minors form a chessboard pattern with
+'s on the main diagonal:
\[ \begin{pmatrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix} \]

We emphasize that $M_{ij}$ denotes a matrix whereas $A_{ij}$ denotes a scalar.




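Example 8.11: Let $A = \begin{pmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{pmatrix}$. Then $M_{23} = \begin{pmatrix} 2 & 3 \\ 8 & 9 \end{pmatrix}$ and
\[ A_{23} = (-1)^{2+3}\begin{vmatrix} 2 & 3 \\ 8 & 9 \end{vmatrix} = -(18-24) = 6 \]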
The following theorem applies. 


Theorem 8.7: The determinant of the matrix $A = (a_{ij})$ is equal to the sum of the products
obtained by multiplying the elements of any row (column) by their respective
cofactors:
\[ |A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij} \]
and
\[ |A| = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} = \sum_{i=1}^{n} a_{ij}A_{ij} \]


The above formulas, called the Laplace expansions of the determinant of A by the ith 
row and the jth column respectively, offer a method of simplifying the computation of |A|. 
That is, by adding a multiple of a row (column) to another row (column) we can reduce A 
to a matrix containing a row or column with one entry 1 and the others 0. Expanding by
this row or column reduces the computation of |A| to the computation of a determinant of 
order one less than that of |A|. 


Example 8.12: Compute the determinant of $A = \begin{pmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{pmatrix}$.

Note that a 1 appears in the second row, third column. Perform the following
operations on A, where $R_i$ denotes the ith row:

(i) add $-2R_2$ to $R_1$, (ii) add $3R_2$ to $R_3$, (iii) add $1R_2$ to $R_4$.

By Theorem 8.3(iii), the value of the determinant does not change by these operations;
that is,
\[ |A| = \begin{vmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{vmatrix} = \begin{vmatrix} 1 & -2 & 0 & 5 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & 0 & 3 \\ 3 & 1 & 0 & 2 \end{vmatrix} \]
Now if we expand by the third column, we may neglect all terms which contain 0.
Thus
\[ |A| = (-1)^{2+3}\begin{vmatrix} 1 & -2 & 5 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix} = -(4 - 18 + 5 - 30 - 3 + 4) = -(-38) = 38 \]
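
The Laplace expansion of Theorem 8.7 is naturally recursive: a determinant of order n is a combination of determinants of order n - 1. Below is a minimal sketch that always expands by the first row and skips zero entries; the names `minor` and `det_laplace` are our own, and a matrix is assumed to be a list of lists.

```python
def minor(matrix, i, j):
    """Submatrix obtained by deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det_laplace(matrix):
    """Determinant by expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue  # terms with a zero coefficient may be neglected
        total += (-1) ** j * entry * det_laplace(minor(matrix, 0, j))
    return total


# A 4 x 4 matrix whose third column has a single nonzero entry.
reduced = [[1, -2, 0, 5], [2, 3, 1, -2], [1, 2, 0, 3], [3, 1, 0, 2]]
print(det_laplace(reduced))  # 38
```

Expanding by a row or column that contains many zeros, as the text recommends, reduces the number of recursive calls; this sketch fixes the first row only to keep the code short.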


CLASSICAL ADJOINT 
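Consider an n-square matrix $A = (a_{ij})$ over a field K:
\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \]

The transpose of the matrix of cofactors of the elements $a_{ij}$ of A, denoted by adj A, is called
the classical adjoint of A:
\[ \operatorname{adj} A = \begin{pmatrix} A_{11} & A_{21} & \cdots & A_{n1} \\ A_{12} & A_{22} & \cdots & A_{n2} \\ \cdots & \cdots & \cdots & \cdots \\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{pmatrix} \]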


We say “classical adjoint” instead of simply “adjoint” because the term adjoint will be used 
in Chapter 13 for an entirely different concept. 


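Example 8.13: Let $A = \begin{pmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{pmatrix}$. The cofactors of the nine elements of A are
\[ A_{11} = +\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, \quad A_{12} = -\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, \quad A_{13} = +\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4 \]
\[ A_{21} = -\begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, \quad A_{22} = +\begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, \quad A_{23} = -\begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5 \]
\[ A_{31} = +\begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, \quad A_{32} = -\begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, \quad A_{33} = +\begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8 \]
We form the transpose of the above matrix of cofactors to obtain the classical adjoint
of A:
\[ \operatorname{adj} A = \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix} \]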


Theorem 8.8: For any square matrix A,
\[ A \cdot (\operatorname{adj} A) = (\operatorname{adj} A) \cdot A = |A| I \]
where I is the identity matrix. Thus, if $|A| \neq 0$,
\[ A^{-1} = \frac{1}{|A|} (\operatorname{adj} A) \]


Observe that the above theorem gives us an important method of obtaining the inverse 
of a given matrix. 


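Example 8.14: Consider the matrix A of the preceding example for which |A| = -46. We have
\[ A \cdot (\operatorname{adj} A) = \begin{pmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{pmatrix} \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix} = \begin{pmatrix} -46 & 0 & 0 \\ 0 & -46 & 0 \\ 0 & 0 & -46 \end{pmatrix} = -46 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = -46I = |A| I \]
We also have, by Theorem 8.8,
\[ A^{-1} = \frac{1}{|A|}(\operatorname{adj} A) = \begin{pmatrix} -18/-46 & -11/-46 & -10/-46 \\ 2/-46 & 14/-46 & -4/-46 \\ 4/-46 & 5/-46 & -8/-46 \end{pmatrix} = \begin{pmatrix} 9/23 & 11/46 & 5/23 \\ -1/23 & -7/23 & 2/23 \\ -2/23 & -5/46 & 4/23 \end{pmatrix} \]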
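
The construction of adj A and of the inverse through Theorem 8.8 can be sketched in a few lines. The code below assumes integer entries and uses exact fractions; the helper names `det`, `minor`, `adjugate` and `inverse` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction


def minor(m, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Laplace expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Classical adjoint: transpose of the matrix of cofactors."""
    n = len(m)
    cof = [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)] for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]


def inverse(m):
    """A^{-1} = (1/|A|) adj A, valid only when |A| is nonzero."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adjugate(m)]


a = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]
print(adjugate(a))  # [[-18, -11, -10], [2, 14, -4], [4, 5, -8]]
print(inverse(a)[0])
```

This route to the inverse needs n^2 determinants of order n - 1, so for larger matrices row reduction is normally preferred.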


APPLICATIONS TO LINEAR EQUATIONS 
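Consider a system of n linear equations in n unknowns:
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n \end{aligned} \]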




Let $\Delta$ denote the determinant of the matrix $A = (a_{ij})$ of coefficients: $\Delta = |A|$. Also, let $\Delta_i$
denote the determinant of the matrix obtained by replacing the ith column of A by the
column of constant terms. The fundamental relationship between determinants and the
solution of the above system follows.


Theorem 8.9: The above system has a unique solution if and only if $\Delta \neq 0$. In this case
the unique solution is given by
\[ x_1 = \frac{\Delta_1}{\Delta}, \quad x_2 = \frac{\Delta_2}{\Delta}, \quad \ldots, \quad x_n = \frac{\Delta_n}{\Delta} \]


The above theorem is known as “Cramer’s rule” for solving systems of linear equations. 
We emphasize that the theorem only refers to a system with the same number of equations 
as unknowns, and that it only gives the solution when $\Delta \neq 0$. In fact, if $\Delta = 0$ the theorem
does not tell whether or not the system has a solution. However, in the case of a homo- 
geneous system we have the following useful result. 


Theorem 8.10: The homogeneous system Ax = 0 has a nonzero solution if and only if
$\Delta = |A| = 0$.
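Example 8.15: Solve, using determinants:
\[ \begin{aligned} 2x - 3y &= 7 \\ 3x + 5y &= 1 \end{aligned} \]
First compute the determinant $\Delta$ of the matrix of coefficients:
\[ \Delta = \begin{vmatrix} 2 & -3 \\ 3 & 5 \end{vmatrix} = 10 + 9 = 19 \]

Since $\Delta \neq 0$, the system has a unique solution. We also have
\[ \Delta_x = \begin{vmatrix} 7 & -3 \\ 1 & 5 \end{vmatrix} = 35 + 3 = 38, \qquad \Delta_y = \begin{vmatrix} 2 & 7 \\ 3 & 1 \end{vmatrix} = 2 - 21 = -19 \]
Accordingly, the unique solution of the system is
\[ x = \frac{\Delta_x}{\Delta} = \frac{38}{19} = 2, \qquad y = \frac{\Delta_y}{\Delta} = \frac{-19}{19} = -1 \]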


We remark that the preceding theorem is of interest more for theoretical and historical 
reasons than for practical reasons. The previous method of solving systems of linear equa- 
tions, i.e. by reducing a system to echelon form, is usually much more efficient than by using 
determinants. 
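
For small systems, Cramer's rule is nevertheless easy to program, which makes it convenient for checking hand computations. The following is a minimal sketch, assuming integer coefficients and exact fractions; `det` and `cramer` are names of our own choosing.

```python
from fractions import Fraction


def det(m):
    """Laplace expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def cramer(a, b):
    """Solve a x = b for a square matrix a with nonzero determinant."""
    delta = det(a)
    if delta == 0:
        raise ValueError("determinant is zero: Cramer's rule does not apply")
    solution = []
    for i in range(len(a)):
        # Replace the ith column of a by the column of constant terms.
        a_i = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        solution.append(Fraction(det(a_i), delta))
    return solution


# For instance, the system 2x - 3y = 7, 3x + 5y = 1:
print(cramer([[2, -3], [3, 5]], [7, 1]))
```

Each unknown costs one additional determinant, which is why reduction to echelon form is the more efficient method in practice.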


DETERMINANT OF A LINEAR OPERATOR 
Using the multiplicative property of the determinant (Theorem 8.5), we obtain 
Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.


Now suppose T is an arbitrary linear operator on a vector space V. We define the
determinant of T, written det (T), by
\[ \det(T) = |[T]_e| \]

where $[T]_e$ is the matrix of T in a basis $\{e_i\}$. By the above theorem this definition is
independent of the particular basis that is chosen.
The next theorem follows from the analogous theorems on matrices.
Theorem 8.12: Let T and S be linear operators on a vector space V. Then
(i) $\det(S \circ T) = \det(S) \cdot \det(T)$,
(ii) T is invertible if and only if $\det(T) \neq 0$.




We also remark that $\det(1_V) = 1$ where $1_V$ is the identity mapping, and that $\det(T^{-1}) =
\det(T)^{-1}$ if T is invertible.

Example 8.16: Let T be the linear operator on $R^3$ defined by
\[ T(x, y, z) = (2x - 4y + z,\; x - 2y + 3z,\; 5x + y - z) \]

The matrix of T in the usual basis of $R^3$ is $[T] = \begin{pmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{pmatrix}$. Then
\[ \det(T) = \begin{vmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{vmatrix} = 2(2-3) + 4(-1-15) + 1(1+10) = -55 \]


MULTILINEARITY AND DETERMINANTS 


Let $\mathcal{A}$ denote the set of all n-square matrices A over a field K. We may view A as an
n-tuple consisting of its row vectors $A_1, A_2, \ldots, A_n$:
\[ A = (A_1, A_2, \ldots, A_n) \]
Hence $\mathcal{A}$ may be viewed as the set of n-tuples of n-tuples in K:
\[ \mathcal{A} = (K^n)^n \]
The following definitions apply.

Definition: A function $D: \mathcal{A} \to K$ is said to be multilinear if it is linear in each of the
components; that is:

(i) if row $A_i = B + C$, then
\[ D(A) = D(\ldots, B + C, \ldots) = D(\ldots, B, \ldots) + D(\ldots, C, \ldots); \]
(ii) if row $A_i = kB$ where $k \in K$, then
\[ D(A) = D(\ldots, kB, \ldots) = k\,D(\ldots, B, \ldots). \]

We also say n-linear for multilinear if there are n components.

Definition: A function $D: \mathcal{A} \to K$ is said to be alternating if D(A) = 0 whenever A has
two identical rows:
\[ D(A_1, A_2, \ldots, A_n) = 0 \quad\text{whenever}\quad A_i = A_j,\; i \neq j \]
We have the following basic result; here I denotes the identity matrix.
Theorem 8.13: There exists a unique function $D: \mathcal{A} \to K$ such that:
(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1.

This function D is none other than the determinant function; that is, for
any matrix $A \in \mathcal{A}$, D(A) = |A|.
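
The three properties in Theorem 8.13 can be checked numerically on a particular matrix. The sketch below is only an illustration on one example, not a proof; the rows `b` and `c`, the scalar `k` and the helper names `det` and `with_row` are all hypothetical choices made for this check.

```python
def det(m):
    """Laplace expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def with_row(m, i, row):
    """Copy of m in which row i is replaced by the given row."""
    return [list(row) if k == i else list(r) for k, r in enumerate(m)]


a = [[2, 3, 4], [5, 6, 7], [8, 9, 1]]
b, c = [1, 0, 2], [4, 6, 5]  # chosen so that b + c equals the second row of a
k = 3

# (i) multilinear: additive and homogeneous in the second row
print(det(a) == det(with_row(a, 1, b)) + det(with_row(a, 1, c)))
print(det(with_row(a, 1, [k * x for x in b])) == k * det(with_row(a, 1, b)))
# (ii) alternating: two identical rows give zero
print(det(with_row(a, 1, a[0])))
# (iii) normalized on the identity matrix
print(det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
```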




Solved Problems 


COMPUTATION OF DETERMINANTS 


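8.1. Evaluate the determinant of each matrix: (i) $\begin{pmatrix} 3 & -2 \\ 4 & 5 \end{pmatrix}$, (ii) $\begin{pmatrix} a-b & a \\ a & a+b \end{pmatrix}$.

(i) $\begin{vmatrix} 3 & -2 \\ 4 & 5 \end{vmatrix} = 3 \cdot 5 - (-2) \cdot 4 = 23$, (ii) $\begin{vmatrix} a-b & a \\ a & a+b \end{vmatrix} = (a-b)(a+b) - a \cdot a = -b^2$.

8.2. Determine those values of k for which $\begin{vmatrix} k & k \\ 4 & 2k \end{vmatrix} = 0$.

$\begin{vmatrix} k & k \\ 4 & 2k \end{vmatrix} = 2k^2 - 4k = 0$, or $2k(k-2) = 0$. Hence k = 0 and k = 2. That is, if k = 0 or
k = 2, the determinant is zero.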


8.3. Compute the determinant of each matrix: 


LD aA eer Pa Oe) al oe Mee al Laer) 
ae = teresa (at). (40 Reels. Gil) (8.0 20817 ivy 8s said 
A ee 1 Bs, 3 at lay sea tat) AE 8? 
Abe Bete 8 
; ie So Sie ae 3 42 
o fea seals sal sles 
= 1(2—15) — 2(-4—6) + 3(20+4) = 79 
Be Oro ey, 
2-3 4 -3 2 
(ii) |4 2 -8 2\ 5) 0; | +25 ; Sen 
5 1 
2a) 4Oce TH 
Gi) | 892 —3| = 200=9) + 1(-9 +2) = —5 
—1 —3. 5 
Tt0es 0 
(iv) [8 2-4] = 16+4) = 10 
APL Es '8 
8.4. Consider the 3-square matrix $A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}$. Show that the diagrams below can
be used to obtain the determinant of A:

[Two arrow diagrams: the left joins the three products taken with plus signs, the right joins the three products taken with minus signs.]

Form the product of each of the three numbers joined by an arrow in the diagram on the left,
and precede each product by a plus sign as follows:
\[ + a_1 b_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3 \]

Now form the product of each of the three numbers joined by an arrow in the diagram on the
right, and precede each product by a minus sign as follows:
\[ - a_3 b_2 c_1 - b_3 c_2 a_1 - c_3 a_2 b_1 \]
Then the determinant of A is precisely the sum of the above two expressions:
\[ |A| = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 b_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3 - a_3 b_2 c_1 - b_3 c_2 a_1 - c_3 a_2 b_1 \]

The above method of computing |A| does not hold for determinants of order greater than 3.


8.5. Evaluate the determinant of each matrix: 


2 0-1 Gi 1b. @ 3 2 -4 
(iam vO! 2A | aie C% OM On ica BIT) 1 0-2 
ANS Oy Er 30 = 2 eae 


(i) Expand the determinant by the second column, neglecting terms containing a 0: 


Zee eal 


21 
Be Oh ae = “| 3 : = SOPs) Ss Ail 
w= B) 
(ii) Use the method of the preceding problem: 
Ch WRG 
ce a@ 6). = -a? +624 2 — abe.— abe — abe = a= 62 + co — sabe 
Or Oa} 


oo ie 4 Sr 2) othe lls) Saal Cae eer; 
1) 02 = 1A) PAG) = g Esc 0) = -3/7 3] ——$ 
SS. gay aaa) Siar Vises | 
4-1 -+ 
8.6. Evaluate the determinant of A= |2 4-1 
1-4 1 
First multiply the first row by 6 and the second row by 4. Then 
Se 6l 0 3 -6+4(3) —2—(3) 3 636 
6° 4\Al = 224 Alo= 13 Ve =a) = la” Baeaey aay eer 
ad 1-44.40) ~ 4— (4) if aa 
sh al hee A= = 
Pees Visaxe ; and |A| = 28/24 = 7/6. 
2 5 -38 -2 
8.7. Evaluate the determinant of A = =e ea 
it32-25 12 
seal ees bel 83 


Note that a 1 appears in the third row, first column. Apply the following operations on A 
(where R; denotes the ith row): (i) add —2R, to Ry, (ii) add 2R, to Ry, (iii) add 1R, to Ry. Thus 




| agg ae walt Oe thats iG 
dese at 2 PSO ES 10" Beart biuees 36 
7.82 are ye Ne og cco! 9 alba Eh 
5 et en ip ows Been ere 
-1+1 1 -6+6(1) On : 
Solel 2° 44 629) b= |) a 8 13 | = -| j ae = -4 
342 2 6446(2) eile oe hy ma 
8 —-2 -—5 A 
—5 2 8 -5 
8.8. Evaluate the determinant of A = 
oe By Stags 
2-8 —5 8 


First reduce A to a matrix which has 1 as an entry, such as adding twice the first row to the 
second row, and then proceed as in the preceding problem. 


SS ih 7 3 —2 S55 4 
Al = Op AS be br 2(8) 2 2(-22)) 8 2b) 5 24) 
=2 24 7528 —2 4 ql —3 
2-3-5 8 2 —3 —5 8 
3-2-5 4 3 —24+2(3) —5+2(3) 4 — 3(3) 
z 1-2-2 3] _ 1) eos 2 -F 21) 3 — 3(1) 
“) \22- 4° 7 =8] — |-2 442(=2) 74 2(—2) —3—38(—2) 
2-3 -5 8 2 —34+2(2) -—5+2(2) 8 — 3(2) 
eh ale Gye eae 4 1 —5-—(1) 
Se te nt eagles lr] Gye 8 8 (3) 
ieee 1-21) 12 joe) 
B Sh = & 
4 1-6 
EAE ie tes -3|' me =) —3(12456) = —bd 
1 =L-3 A 
8.9. Evaluate the determinant of $A = \begin{pmatrix} t+3 & -1 & 1 \\ 5 & t-3 & 1 \\ 6 & -6 & t+4 \end{pmatrix}$.

Add the second column to the first column, and then add the third column to the second column
to obtain
\[ |A| = \begin{vmatrix} t+2 & 0 & 1 \\ t+2 & t-2 & 1 \\ 0 & t-2 & t+4 \end{vmatrix} \]

Now factor t + 2 from the first column and t - 2 from the second column to get
\[ |A| = (t+2)(t-2)\begin{vmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & t+4 \end{vmatrix} \]

Finally subtract the first column from the third column to obtain
\[ |A| = (t+2)(t-2)\begin{vmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & t+4 \end{vmatrix} = (t+2)(t-2)(t+4) \]


COFACTORS Pi eo eee 
: : \ 5 40 Tee 
8.10. Find the cofactor of the 7 in the matrix Wee eS 
8-2 5 2 
2 12s 4 ; 
5-4-7 2 ee ee. 4-8 
(—1)2+3 = AL BS 3) a Prin = wil 
4-00-38 B22 2 7 0 10 
3-2 8 2 
The exponent 2+ 3 comes from the fact that 7 appears in the second row, third column. 
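8.11. Consider the matrix $A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 1 & 5 & 7 \end{pmatrix}$. (i) Compute |A|. (ii) Find adj A. (iii) Verify
$A \cdot (\operatorname{adj} A) = |A| I$. (iv) Find $A^{-1}$.

\[ \text{(i)}\quad |A| = 1\begin{vmatrix} 3 & 4 \\ 5 & 7 \end{vmatrix} - 2\begin{vmatrix} 2 & 4 \\ 1 & 7 \end{vmatrix} + 3\begin{vmatrix} 2 & 3 \\ 1 & 5 \end{vmatrix} = 1 - 20 + 21 = 2 \]
\[ \text{(ii)}\quad \operatorname{adj} A = \begin{pmatrix} +\begin{vmatrix} 3 & 4 \\ 5 & 7 \end{vmatrix} & -\begin{vmatrix} 2 & 4 \\ 1 & 7 \end{vmatrix} & +\begin{vmatrix} 2 & 3 \\ 1 & 5 \end{vmatrix} \\ -\begin{vmatrix} 2 & 3 \\ 5 & 7 \end{vmatrix} & +\begin{vmatrix} 1 & 3 \\ 1 & 7 \end{vmatrix} & -\begin{vmatrix} 1 & 2 \\ 1 & 5 \end{vmatrix} \\ +\begin{vmatrix} 2 & 3 \\ 3 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 3 \\ 2 & 4 \end{vmatrix} & +\begin{vmatrix} 1 & 2 \\ 2 & 3 \end{vmatrix} \end{pmatrix}^{t} = \begin{pmatrix} 1 & -10 & 7 \\ 1 & 4 & -3 \\ -1 & 2 & -1 \end{pmatrix}^{t} = \begin{pmatrix} 1 & 1 & -1 \\ -10 & 4 & 2 \\ 7 & -3 & -1 \end{pmatrix} \]
That is, adj A is the transpose of the matrix of cofactors. Observe that the "signs" in the
matrix of cofactors form the chessboard pattern $\begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix}$.
\[ \text{(iii)}\quad A \cdot (\operatorname{adj} A) = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 1 & 5 & 7 \end{pmatrix}\begin{pmatrix} 1 & 1 & -1 \\ -10 & 4 & 2 \\ 7 & -3 & -1 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} = 2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = |A| I \]
\[ \text{(iv)}\quad A^{-1} = \frac{1}{|A|}(\operatorname{adj} A) = \frac{1}{2}\begin{pmatrix} 1 & 1 & -1 \\ -10 & 4 & 2 \\ 7 & -3 & -1 \end{pmatrix} = \begin{pmatrix} 1/2 & 1/2 & -1/2 \\ -5 & 2 & 1 \\ 7/2 & -3/2 & -1/2 \end{pmatrix} \]

8.12. Consider an arbitrary 2 by 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.
(i) Find adj A. (ii) Show that adj (adj A) = A.

\[ \text{(i)}\quad \operatorname{adj} A = \begin{pmatrix} +|d| & -|c| \\ -|b| & +|a| \end{pmatrix}^{t} = \begin{pmatrix} d & -c \\ -b & a \end{pmatrix}^{t} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]
\[ \text{(ii)}\quad \operatorname{adj}(\operatorname{adj} A) = \operatorname{adj}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} +|a| & -|-c| \\ -|-b| & +|d| \end{pmatrix}^{t} = \begin{pmatrix} a & c \\ b & d \end{pmatrix}^{t} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = A \]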




DETERMINANTS AND SYSTEMS OF LINEAR EQUATIONS 


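8.13. Solve for x and y, using determinants:
\[ \text{(i)}\; \begin{aligned} 2x + y &= 7 \\ 3x - 5y &= 4 \end{aligned} \qquad \text{(ii)}\; \begin{aligned} ax - 2by &= c \\ 3ax - 5by &= 2c \end{aligned} \quad\text{where } ab \neq 0. \]

(i) $\Delta = \begin{vmatrix} 2 & 1 \\ 3 & -5 \end{vmatrix} = -13$, $\Delta_x = \begin{vmatrix} 7 & 1 \\ 4 & -5 \end{vmatrix} = -39$, $\Delta_y = \begin{vmatrix} 2 & 7 \\ 3 & 4 \end{vmatrix} = -13$. Then $x = \Delta_x/\Delta = 3$,
$y = \Delta_y/\Delta = 1$.

(ii) $\Delta = \begin{vmatrix} a & -2b \\ 3a & -5b \end{vmatrix} = ab$, $\Delta_x = \begin{vmatrix} c & -2b \\ 2c & -5b \end{vmatrix} = -bc$, $\Delta_y = \begin{vmatrix} a & c \\ 3a & 2c \end{vmatrix} = -ac$. Then
$x = \Delta_x/\Delta = -bc/ab = -c/a$, $y = \Delta_y/\Delta = -ac/ab = -c/b$.

8.14. Solve using determinants:
\[ \begin{aligned} 3y + 2x &= z + 1 \\ 3x + 2z &= 8 - 5y \\ 3z - 1 &= x - 2y \end{aligned} \]
First arrange the system in standard form with the unknowns appearing in columns:
\[ \begin{aligned} 2x + 3y - z &= 1 \\ 3x + 5y + 2z &= 8 \\ x - 2y - 3z &= -1 \end{aligned} \]
Compute the determinant $\Delta$ of the matrix A of coefficients:
\[ \Delta = \begin{vmatrix} 2 & 3 & -1 \\ 3 & 5 & 2 \\ 1 & -2 & -3 \end{vmatrix} = 2(-15+4) - 3(-9-2) - 1(-6-5) = 22 \]

Since $\Delta \neq 0$, the system has a unique solution. To obtain $\Delta_x$, $\Delta_y$ and $\Delta_z$, replace the coefficients of
the unknown in the matrix A by the column of constants. Thus
\[ \Delta_x = \begin{vmatrix} 1 & 3 & -1 \\ 8 & 5 & 2 \\ -1 & -2 & -3 \end{vmatrix} = 66, \quad \Delta_y = \begin{vmatrix} 2 & 1 & -1 \\ 3 & 8 & 2 \\ 1 & -1 & -3 \end{vmatrix} = -22, \quad \Delta_z = \begin{vmatrix} 2 & 3 & 1 \\ 3 & 5 & 8 \\ 1 & -2 & -1 \end{vmatrix} = 44 \]
and $x = \Delta_x/\Delta = 3$, $y = \Delta_y/\Delta = -1$, $z = \Delta_z/\Delta = 2$.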
PROOF OF THEOREMS 
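8.15. Prove Theorem 8.1: $|A^t| = |A|$.
Suppose $A = (a_{ij})$. Then $A^t = (b_{ij})$ where $b_{ij} = a_{ji}$. Hence
\[ |A^t| = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} \]
Let $\tau = \sigma^{-1}$. By Problem 8.36, $\operatorname{sgn}\tau = \operatorname{sgn}\sigma$, and
\[ a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} = a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)} \]
Hence
\[ |A^t| = \sum_{\sigma \in S_n} (\operatorname{sgn}\tau)\, a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)} \]

However, as $\sigma$ runs through all the elements of $S_n$, $\tau = \sigma^{-1}$ also runs through all the elements of
$S_n$. Thus $|A^t| = |A|$.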


8.16. Prove Theorem 8.3(ii): Let B be obtained from a square matrix A by interchanging
two rows (columns) of A. Then |B| = -|A|.

We prove the theorem for the case that two columns are interchanged. Let $\tau$ be the transposition
which interchanges the two numbers corresponding to the two columns of A that are
interchanged. If $A = (a_{ij})$ and $B = (b_{ij})$, then $b_{ij} = a_{i\tau(j)}$. Hence, for any permutation $\sigma$,
\[ b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = a_{1\tau\sigma(1)} a_{2\tau\sigma(2)} \cdots a_{n\tau\sigma(n)} \]
Thus
\[ |B| = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{1\tau\sigma(1)} a_{2\tau\sigma(2)} \cdots a_{n\tau\sigma(n)} \]
Since the transposition $\tau$ is an odd permutation, $\operatorname{sgn}\tau\sigma = \operatorname{sgn}\tau \cdot \operatorname{sgn}\sigma = -\operatorname{sgn}\sigma$. Thus $\operatorname{sgn}\sigma =
-\operatorname{sgn}\tau\sigma$, and so
\[ |B| = -\sum_{\sigma \in S_n} (\operatorname{sgn}\tau\sigma)\, a_{1\tau\sigma(1)} a_{2\tau\sigma(2)} \cdots a_{n\tau\sigma(n)} \]

But as $\sigma$ runs through all the elements of $S_n$, $\tau\sigma$ also runs through all the elements of $S_n$; hence
|B| = -|A|.

8.17. Prove Theorem 8.2: (i) If A has a row (column) of zeros, then |A| = 0. (ii) If A has
two identical rows (columns), then |A| = 0. (iii) If A is triangular, then |A| = product
of diagonal elements. Thus in particular, |I| = 1 where I is the identity matrix.

(i) Each term in |A| contains a factor from every row and so from the row of zeros. Thus each
term of |A| is zero and so |A| = 0.

(ii) Suppose $1 + 1 \neq 0$ in K. If we interchange the two identical rows of A, we still obtain the
matrix A. Hence by the preceding problem, |A| = -|A| and so |A| = 0.

Now suppose 1 + 1 = 0 in K. Then $\operatorname{sgn}\sigma = 1$ for every $\sigma \in S_n$. Since A has two identical
rows, we can arrange the terms of A into pairs of equal terms. Since each pair is 0, the
determinant of A is zero.

(iii) Suppose $A = (a_{ij})$ is lower triangular, that is, the entries above the diagonal are all zero:
$a_{ij} = 0$ whenever $i < j$. Consider a term t of the determinant of A:
\[ t = (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}, \quad\text{where } \sigma = i_1 i_2 \ldots i_n \]
Suppose $i_1 \neq 1$. Then $1 < i_1$ and so $a_{1i_1} = 0$; hence t = 0. That is, each term for which
$i_1 \neq 1$ is zero.

Now suppose $i_1 = 1$ but $i_2 \neq 2$. Then $2 < i_2$ and so $a_{2i_2} = 0$; hence t = 0. Thus each
term for which $i_1 \neq 1$ or $i_2 \neq 2$ is zero.

Similarly we obtain that each term for which $i_1 \neq 1$ or $i_2 \neq 2$ or ... or $i_n \neq n$ is zero.
Accordingly, $|A| = a_{11} a_{22} \cdots a_{nn}$ = product of diagonal elements.

8.18. Prove Theorem 8.3: Let B be obtained from A by
(i) multiplying a row (column) of A by a scalar k; then |B| = k|A|.
(ii) interchanging two rows (columns) of A; then |B| = -|A|.
(iii) adding a multiple of a row (column) of A to another; then |B| = |A|.

(i) If the jth row of A is multiplied by k, then every term in |A| is multiplied by k and so
|B| = k|A|. That is,
\[ |B| = \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots (k a_{ji_j}) \cdots a_{ni_n} = k \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n} = k|A| \]
(ii) Proved in Problem 8.16.

(iii) Suppose c times the kth row is added to the jth row of A. Using the symbol ^ to denote the
jth position in a determinant term, we have
\[ |B| = \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots \widehat{(c a_{ki_j} + a_{ji_j})} \cdots a_{ni_n} \]
\[ = c \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots \widehat{a_{ki_j}} \cdots a_{ni_n} + \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ji_j} \cdots a_{ni_n} \]

The first sum is the determinant of a matrix whose kth and jth rows are identical;
hence by Theorem 8.2(ii) the sum is zero. The second sum is the determinant of A. Thus
$|B| = c \cdot 0 + |A| = |A|$.


8.19. Prove Lemma 8.6: For any elementary matrix E, |EA| = |E||A|.

Consider the following elementary row operations: (i) multiply a row by a constant $k \neq 0$;
(ii) interchange two rows; (iii) add a multiple of one row to another. Let $E_1$, $E_2$ and $E_3$ be the
corresponding elementary matrices. That is, $E_1$, $E_2$ and $E_3$ are obtained by applying the above
operations, respectively, to the identity matrix I. By the preceding problem,
\[ |E_1| = k|I| = k, \quad |E_2| = -|I| = -1, \quad |E_3| = |I| = 1 \]
Recall (page 56) that $E_i A$ is identical to the matrix obtained by applying the corresponding
operation to A. Thus by the preceding problem,
\[ |E_1 A| = k|A| = |E_1||A|, \quad |E_2 A| = -|A| = |E_2||A|, \quad |E_3 A| = |A| = 1|A| = |E_3||A| \]
and the lemma is proved.

8.20. Suppose B is row equivalent to A; say $B = E_n E_{n-1} \cdots E_2 E_1 A$ where the $E_i$ are
elementary matrices. Show that:

(i) $|B| = |E_n||E_{n-1}| \cdots |E_2||E_1||A|$, (ii) $|B| \neq 0$ if and only if $|A| \neq 0$.

(i) By the preceding problem, $|E_1 A| = |E_1||A|$. Hence by induction,
\[ |B| = |E_n||E_{n-1} \cdots E_2 E_1 A| = |E_n||E_{n-1}| \cdots |E_2||E_1||A| \]

(ii) By the preceding problem, $|E_i| \neq 0$ for each i. Hence $|B| \neq 0$ if and only if $|A| \neq 0$.

8.21. Prove Theorem 8.4: Let A be an n-square matrix. Then the following are equivalent:
(i) A is invertible, (ii) A is nonsingular, (iii) $|A| \neq 0$.

By Problem 6.44, (i) and (ii) are equivalent. Hence it suffices to show that (i) and (iii) are
equivalent.

Suppose A is invertible. Then A is row equivalent to the identity matrix I. But $|I| \neq 0$; hence
by the preceding problem, $|A| \neq 0$. On the other hand, suppose A is not invertible. Then A is row
equivalent to a matrix B which has a zero row. By Theorem 8.2(i), |B| = 0; then by the preceding
problem, |A| = 0. Thus (i) and (iii) are equivalent.

8.22. Prove Theorem 8.5: |AB| = |A||B|.
If A is singular, then AB is also singular and so |AB| = 0 = |A||B|. On the other hand if A
is nonsingular, then $A = E_n \cdots E_2 E_1$, a product of elementary matrices. Thus, by Problem 8.20,
\[ |A| = |E_n \cdots E_2 E_1 I| = |E_n| \cdots |E_2||E_1||I| = |E_n| \cdots |E_2||E_1| \]
and so
\[ |AB| = |E_n \cdots E_2 E_1 B| = |E_n| \cdots |E_2||E_1||B| = |A||B| \]

8.23. Prove Theorem 8.7: Let $A = (a_{ij})$; then $|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in}$, where
$A_{ij}$ is the cofactor of $a_{ij}$.

Each term in |A| contains one and only one entry of the ith row $(a_{i1}, a_{i2}, \ldots, a_{in})$ of A. Hence
we can write |A| in the form
\[ |A| = a_{i1}A^*_{i1} + a_{i2}A^*_{i2} + \cdots + a_{in}A^*_{in} \]
(Note $A^*_{ij}$ is a sum of terms involving no entry of the ith row of A.) Thus the theorem is proved if
we can show that
\[ A^*_{ij} = A_{ij} = (-1)^{i+j}|M_{ij}| \]
where $M_{ij}$ is the matrix obtained by deleting the row and column containing the entry $a_{ij}$.
(Historically, the expression $A^*_{ij}$ was defined as the cofactor of $a_{ij}$, and so the theorem reduces to showing
that the two definitions of the cofactor are equivalent.)

First we consider the case that i = n, j = n. Then the sum of terms in |A| containing $a_{nn}$ is
\[ a_{nn}A^*_{nn} = a_{nn} \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n-1,\sigma(n-1)} \]
where we sum over all permutations $\sigma \in S_n$ for which $\sigma(n) = n$. However, this is equivalent
(Problem 8.63) to summing over all permutations of {1, ..., n-1}. Thus $A^*_{nn} = |M_{nn}| = (-1)^{n+n}|M_{nn}|$.

Now we consider any i and j. We interchange the ith row with each succeeding row until it is
last, and we interchange the jth column with each succeeding column until it is last. Note that the
determinant $|M_{ij}|$ is not affected since the relative positions of the other rows and columns are not
affected by these interchanges. However, the "sign" of |A| and of $A^*_{ij}$ is changed n - i and then
n - j times. Accordingly,
\[ A^*_{ij} = (-1)^{n-i+n-j}|M_{ij}| = (-1)^{i+j}|M_{ij}| \]

8.24. Let $A = (a_{ij})$ and let B be the matrix obtained from A by replacing the ith row of A
by the row vector $(b_{i1}, \ldots, b_{in})$. Show that
\[ |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in} \]
Furthermore, show that, for $j \neq i$,
\[ a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \]
and
\[ a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0 \]

Let $B = (b_{ij})$. By the preceding problem,
\[ |B| = b_{i1}B_{i1} + b_{i2}B_{i2} + \cdots + b_{in}B_{in} \]
Since $B_{ij}$ does not depend upon the ith row of B, $B_{ij} = A_{ij}$ for j = 1, ..., n. Hence
\[ |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in} \]

Now let A' be obtained from A by replacing the ith row of A by the jth row of A. Since A'
has two identical rows, |A'| = 0. Thus by the above result,
\[ |A'| = a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \]

Using $|A^t| = |A|$, we also obtain that $a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0$.

8.25. Prove Theorem 8.8: $A \cdot (\operatorname{adj} A) = (\operatorname{adj} A) \cdot A = |A| I$. Thus if $|A| \neq 0$, $A^{-1} =
(1/|A|)(\operatorname{adj} A)$.

Let $A = (a_{ij})$ and let $A \cdot (\operatorname{adj} A) = (b_{ij})$. The ith row of A is
\[ (a_{i1}, a_{i2}, \ldots, a_{in}) \qquad (1) \]

Since adj A is the transpose of the matrix of cofactors, the jth column of adj A is the transpose of
the cofactors of the jth row of A:
\[ (A_{j1}, A_{j2}, \ldots, A_{jn})^t \qquad (2) \]
Now $b_{ij}$, the ij-entry in $A \cdot (\operatorname{adj} A)$, is obtained by multiplying (1) and (2):
\[ b_{ij} = a_{i1}A_{j1} + a_{i2}A_{j2} + \cdots + a_{in}A_{jn} \]
Thus by Theorem 8.7 and the preceding problem,
\[ b_{ij} = \begin{cases} |A| & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]

Accordingly, $A \cdot (\operatorname{adj} A)$ is the diagonal matrix with each diagonal element |A|. In other words,
$A \cdot (\operatorname{adj} A) = |A| I$. Similarly, $(\operatorname{adj} A) \cdot A = |A| I$.


8.26. Prove Theorem 8.9: The system of linear equations Ax = b has a unique solution
if and only if $\Delta = |A| \neq 0$. In this case the unique solution is given by $x_1 = \Delta_1/\Delta$,
$x_2 = \Delta_2/\Delta$, ..., $x_n = \Delta_n/\Delta$.

By preceding results, Ax = b has a unique solution if and only if A is invertible, and A is
invertible if and only if $\Delta = |A| \neq 0$.

Now suppose $\Delta \neq 0$. By Problem 8.25, $A^{-1} = (1/\Delta)(\operatorname{adj} A)$. Multiplying Ax = b by $A^{-1}$, we
obtain
\[ x = A^{-1}Ax = (1/\Delta)(\operatorname{adj} A)b \qquad (1) \]
Note that the ith row of $(1/\Delta)(\operatorname{adj} A)$ is $(1/\Delta)(A_{1i}, A_{2i}, \ldots, A_{ni})$. If $b = (b_1, b_2, \ldots, b_n)^t$ then, by (1),
\[ x_i = (1/\Delta)(b_1 A_{1i} + b_2 A_{2i} + \cdots + b_n A_{ni}) \]

However, as in Problem 8.24,
\[ b_1 A_{1i} + b_2 A_{2i} + \cdots + b_n A_{ni} = \Delta_i, \]
the determinant of the matrix obtained by replacing the ith column of A by the column vector b.
Thus $x_i = (1/\Delta)\Delta_i$, as required.

8.27. Suppose P is invertible. Show that $|P^{-1}| = |P|^{-1}$.
$P^{-1}P = I$. Hence $1 = |I| = |P^{-1}P| = |P^{-1}||P|$, and so $|P^{-1}| = |P|^{-1}$.

8.28. Prove Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.

Since A and B are similar, there exists an invertible matrix P such that $B = P^{-1}AP$. Then
by the preceding problem, $|B| = |P^{-1}AP| = |P^{-1}||A||P| = |A||P^{-1}||P| = |A|$.

We remark that although the matrices $P^{-1}$ and A may not commute, their determinants $|P^{-1}|$
and |A| do commute since they are scalars in the field K.

8.29. Prove Theorem 8.13: There exists a unique function $D: \mathcal{A} \to K$ such that (i) D is
multilinear, (ii) D is alternating, (iii) D(I) = 1. This function D is the determinant
function, i.e. D(A) = |A|.

Let D be the determinant function: D(A) = |A|. We must show that D satisfies (i), (ii) and (iii),
and that D is the only function satisfying (i), (ii) and (iii).

By preceding results, D satisfies (ii) and (iii); hence we need show that it is multilinear. Suppose
$A = (a_{ij}) = (A_1, A_2, \ldots, A_n)$ where $A_k$ is the kth row of A. Furthermore, suppose for a fixed i,
\[ A_i = B_i + C_i, \quad\text{where } B_i = (b_1, \ldots, b_n) \text{ and } C_i = (c_1, \ldots, c_n) \]
Accordingly,
\[ a_{i1} = b_1 + c_1, \quad a_{i2} = b_2 + c_2, \quad \ldots, \quad a_{in} = b_n + c_n \]
Expanding D(A) = |A| by the ith row,
\[ \begin{aligned} D(A) = D(A_1, \ldots, B_i + C_i, \ldots, A_n) &= a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} \\ &= (b_1 + c_1)A_{i1} + (b_2 + c_2)A_{i2} + \cdots + (b_n + c_n)A_{in} \\ &= (b_1 A_{i1} + b_2 A_{i2} + \cdots + b_n A_{in}) + (c_1 A_{i1} + c_2 A_{i2} + \cdots + c_n A_{in}) \end{aligned} \]

However, by Problem 8.24, the two sums above are the determinants of the matrices obtained from
A by replacing the ith row by $B_i$ and $C_i$ respectively. That is,
\[ D(A) = D(A_1, \ldots, B_i + C_i, \ldots, A_n) = D(A_1, \ldots, B_i, \ldots, A_n) + D(A_1, \ldots, C_i, \ldots, A_n) \]
Furthermore, by Theorem 8.3(i),
\[ D(A_1, \ldots, kA_i, \ldots, A_n) = k\,D(A_1, \ldots, A_i, \ldots, A_n) \]

Thus D is multilinear, i.e. D satisfies (i).

We next must prove the uniqueness of D. Suppose D satisfies (i), (ii) and (iii). If $\{e_1, \ldots, e_n\}$
is the usual basis of $K^n$, then by (iii), $D(e_1, e_2, \ldots, e_n) = D(I) = 1$. Using (ii) we also have (Problem
8.73) that
\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \operatorname{sgn}\sigma, \quad\text{where } \sigma = i_1 i_2 \ldots i_n \qquad (1) \]
Now suppose $A = (a_{ij})$. Observe that the kth row $A_k$ of A is
\[ A_k = (a_{k1}, a_{k2}, \ldots, a_{kn}) = a_{k1}e_1 + a_{k2}e_2 + \cdots + a_{kn}e_n \]
Thus
\[ D(A) = D(a_{11}e_1 + \cdots + a_{1n}e_n,\; a_{21}e_1 + \cdots + a_{2n}e_n,\; \ldots,\; a_{n1}e_1 + \cdots + a_{nn}e_n) \]
Using the multilinearity of D, we can write D(A) as a sum of terms of the form
\[ D(A) = \sum D(a_{1i_1}e_{i_1}, a_{2i_2}e_{i_2}, \ldots, a_{ni_n}e_{i_n}) = \sum (a_{1i_1} a_{2i_2} \cdots a_{ni_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) \qquad (2) \]
where the sum is summed over all sequences $i_1 i_2 \ldots i_n$ where $i_k \in \{1, \ldots, n\}$. If two of the indices
are equal, say $i_j = i_k$ but $j \neq k$, then by (ii),
\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = 0 \]
Accordingly, the sum in (2) need only be summed over all permutations $\sigma = i_1 i_2 \ldots i_n$. Using (1),
we finally have that
\[ D(A) = \sum_{\sigma} (a_{1i_1} a_{2i_2} \cdots a_{ni_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}, \quad\text{where } \sigma = i_1 i_2 \ldots i_n \]
Hence D is the determinant function and so the theorem is proved.
PERMUTATIONS 


8.30. Determine the parity of o = 542163. 


Method 1. 
We need to obtain the number of pairs (i, j) for which i > j and i precedes j in $\sigma$. There are:
3 numbers (5, 4 and 2) greater than and preceding 1, 
2 numbers (5 and 4) greater than and preceding 2, 
3 numbers (5, 4 and 6) greater than and preceding 3, 
1 number (5) greater than and preceding 4, 
0 numbers greater than and preceding 5, 


0 numbers greater than and preceding 6. 


Since 3+2+3+1+0+0 = 9 is odd, o is an odd permutation and so sgno = —1. 
Method 2. 
Transpose 1 to the first position as follows:
5 4 2 1 6 3   to   1 5 4 2 6 3
Transpose 2 to the second position:
1 5 4 2 6 3   to   1 2 5 4 6 3
Transpose 3 to the third position:
1 2 5 4 6 3   to   1 2 3 5 4 6
Transpose 4 to the fourth position:
1 2 3 5 4 6   to   1 2 3 4 5 6

Note that 5 and 6 are in the "correct" positions. Count the number of numbers "jumped":
3 + 2 + 3 + 1 = 9. Since 9 is odd, $\sigma$ is an odd permutation. (Remark: This method is essentially
the same as the preceding method.)




Method 3. 


An interchange of two numbers in a permutation is equivalent to multiplying the permutation by a transposition. Hence transform $\sigma$ to the identity permutation using transpositions; such as,

5 4 2 1 6 3  (interchange 5 and 1)
1 4 2 5 6 3  (interchange 4 and 2)
1 2 4 5 6 3  (interchange 4 and 3)
1 2 3 5 6 4  (interchange 5 and 4)
1 2 3 4 6 5  (interchange 6 and 5)
1 2 3 4 5 6


Since an odd number, 5, of transpositions was used, o is an odd permutation. 
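
The inversion count of Method 1 can be checked mechanically. The following is a minimal sketch in Python; the function names `inversions` and `sign` are illustrative choices, not notation from the text, and a permutation is assumed to be given as a tuple in one-line notation.

```python
def inversions(perm):
    """Count pairs (i, j) with i > j where i precedes j in perm."""
    count = 0
    for pos, i in enumerate(perm):
        for j in perm[pos + 1:]:
            if i > j:
                count += 1
    return count


def sign(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    return -1 if inversions(perm) % 2 else 1


sigma = (5, 4, 2, 1, 6, 3)
print(inversions(sigma), sign(sigma))  # 9 -1
```

The double loop mirrors the count in Method 1 directly; it is quadratic in n, which is adequate for hand-sized examples.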


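8.31. Let $\sigma = 24513$ and $\tau = 41352$ be permutations in $S_5$. Find (i) the composition permutations $\tau\circ\sigma$ and $\sigma\circ\tau$, (ii) $\sigma^{-1}$.

Recall that $\sigma = 24513$ and $\tau = 41352$ are short ways of writing
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix} \quad\text{and}\quad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 5 & 2 \end{pmatrix} \]
which means
\[ \sigma(1) = 2,\ \sigma(2) = 4,\ \sigma(3) = 5,\ \sigma(4) = 1,\ \sigma(5) = 3 \]
and
\[ \tau(1) = 4,\ \tau(2) = 1,\ \tau(3) = 3,\ \tau(4) = 5,\ \tau(5) = 2 \]

(i) Follow each number through the first mapping and then the second:

      1  2  3  4  5                  1  2  3  4  5
 σ    2  4  5  1  3      and    τ    4  1  3  5  2
 τ    1  5  2  4  3             σ    1  2  5  3  4

Thus $\tau\circ\sigma = 15243$ and $\sigma\circ\tau = 12534$.

(ii) Interchange the two rows of $\sigma$ and rearrange the columns:
\[ \sigma^{-1} = \begin{pmatrix} 2 & 4 & 5 & 1 & 3 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 5 & 2 & 3 \end{pmatrix} \]
That is, $\sigma^{-1} = 41523$.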
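
Composition and inversion in one-line notation can be sketched as follows. This is an illustrative Python fragment; the helper names `compose` and `inverse` are assumptions made here, and permutations are taken to be tuples of the numbers 1, ..., n.

```python
def compose(tau, sigma):
    """One-line notation of tau o sigma, i.e. i -> tau(sigma(i))."""
    return tuple(tau[s - 1] for s in sigma)


def inverse(sigma):
    """One-line notation of the inverse permutation."""
    inv = [0] * len(sigma)
    for i, s in enumerate(sigma, start=1):
        inv[s - 1] = i  # sigma sends i to s, so the inverse sends s to i
    return tuple(inv)


sigma = (2, 4, 5, 1, 3)
tau = (4, 1, 3, 5, 2)
print(compose(tau, sigma))  # (1, 5, 2, 4, 3)
print(compose(sigma, tau))  # (1, 2, 5, 3, 4)
print(inverse(sigma))       # (4, 1, 5, 2, 3)
```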


8.32. Consider any permutation $\sigma = j_1 j_2 \ldots j_n$. Show that for each pair $(i, k)$ such that
\[ i > k \quad\text{and}\quad i \text{ precedes } k \text{ in } \sigma \]
there is a pair $(i^*, k^*)$ such that
\[ i^* < k^* \quad\text{and}\quad \sigma(i^*) > \sigma(k^*) \tag{1} \]
and vice versa. Thus $\sigma$ is even or odd according as to whether there is an even or odd number of pairs satisfying (1).

Choose $i^*$ and $k^*$ so that $\sigma(i^*) = i$ and $\sigma(k^*) = k$. Then $i > k$ if and only if $\sigma(i^*) > \sigma(k^*)$, and $i$ precedes $k$ in $\sigma$ if and only if $i^* < k^*$.


8.33. Consider the polynomial $g = g(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j)$. Write out explicitly the polynomial $g = g(x_1, x_2, x_3, x_4)$.

The symbol $\prod$ is used for a product of terms in the same way that the symbol $\sum$ is used for a sum of terms. That is, $\prod_{i<j} (x_i - x_j)$ means the product of all terms $(x_i - x_j)$ for which $i < j$. Hence
\[ g = g(x_1, \ldots, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4) \]


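8.34. Let $\sigma$ be an arbitrary permutation. For the polynomial $g$ in the preceding problem, define $\sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)})$. Show that
\[ \sigma(g) = \begin{cases} g & \text{if } \sigma \text{ is even} \\ -g & \text{if } \sigma \text{ is odd} \end{cases} \]
Accordingly, $\sigma(g) = (\operatorname{sgn}\sigma)\,g$.

Since $\sigma$ is one-one and onto,
\[ \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) = \prod_{i<j \text{ or } i>j} (x_i - x_j) \]
Thus $\sigma(g) = g$ or $\sigma(g) = -g$ according as to whether there is an even or an odd number of terms of the form $(x_i - x_j)$ where $i > j$. Note that for each pair $(i, j)$ for which
\[ i < j \quad\text{and}\quad \sigma(i) > \sigma(j) \tag{1} \]
there is a term $(x_{\sigma(i)} - x_{\sigma(j)})$ in $\sigma(g)$ for which $\sigma(i) > \sigma(j)$. Since $\sigma$ is even if and only if there is an even number of pairs satisfying (1), we have $\sigma(g) = g$ if and only if $\sigma$ is even; hence $\sigma(g) = -g$ if and only if $\sigma$ is odd.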


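8.35. Let $\sigma, \tau \in S_n$. Show that $\operatorname{sgn}(\tau\circ\sigma) = (\operatorname{sgn}\tau)(\operatorname{sgn}\sigma)$. Thus the product of two even or two odd permutations is even, and the product of an odd and an even permutation is odd.

Using the preceding problem, we have
\[ \operatorname{sgn}(\tau\circ\sigma)\,g = (\tau\circ\sigma)(g) = \tau(\sigma(g)) = \tau((\operatorname{sgn}\sigma)g) = (\operatorname{sgn}\tau)(\operatorname{sgn}\sigma)\,g \]
Accordingly, $\operatorname{sgn}(\tau\circ\sigma) = (\operatorname{sgn}\tau)(\operatorname{sgn}\sigma)$.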
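
The multiplicative property of the sign can be confirmed exhaustively for small n. The sketch below is only a numerical check, not a proof; it assumes one-line notation as tuples, and the names `sign`, `compose` and `sign_is_multiplicative` are hypothetical helpers introduced for the illustration.

```python
from itertools import permutations


def sign(perm):
    """+1 or -1 according to the parity of the number of inversions."""
    inv = sum(1 for a in range(len(perm))
              for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inv % 2 else 1


def compose(tau, sigma):
    """One-line notation of tau o sigma, i.e. i -> tau(sigma(i))."""
    return tuple(tau[s - 1] for s in sigma)


def sign_is_multiplicative(n):
    """Check sgn(tau o sigma) = sgn(tau) sgn(sigma) for all pairs in S_n."""
    perms = list(permutations(range(1, n + 1)))
    return all(sign(compose(t, s)) == sign(t) * sign(s)
               for s in perms for t in perms)


print(sign_is_multiplicative(4))  # True
```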


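8.36. Consider the permutation $\sigma = j_1 j_2 \ldots j_n$. Show that $\operatorname{sgn}\sigma^{-1} = \operatorname{sgn}\sigma$ and, for scalars $a_{ij}$,
\[ a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1k_1} a_{2k_2} \cdots a_{nk_n}, \quad \text{where } \sigma^{-1} = k_1 k_2 \ldots k_n \]

We have $\sigma^{-1}\circ\sigma = \epsilon$, the identity permutation. Since $\epsilon$ is even, $\sigma^{-1}$ and $\sigma$ are both even or both odd. Hence $\operatorname{sgn}\sigma^{-1} = \operatorname{sgn}\sigma$.

Since $\sigma = j_1 j_2 \ldots j_n$ is a permutation, the factors can be rearranged so that $a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1k_1} a_{2k_2} \cdots a_{nk_n}$. Then $k_1, k_2, \ldots, k_n$ have the property that
\[ \sigma(k_1) = 1,\ \sigma(k_2) = 2,\ \ldots,\ \sigma(k_n) = n \]
Let $\tau = k_1 k_2 \ldots k_n$. Then for $i = 1, \ldots, n$,
\[ (\sigma\circ\tau)(i) = \sigma(\tau(i)) = \sigma(k_i) = i \]
Thus $\sigma\circ\tau = \epsilon$, the identity permutation; hence $\tau = \sigma^{-1}$.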




MISCELLANEOUS PROBLEMS 
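8.37. Find det(T) for each linear operator T:
(i) T is the operator on $\mathbf{R}^3$ defined by
\[ T(x, y, z) = (2x - z,\; x + 2y - 4z,\; 3x - 3y + z) \]
(ii) T is the operator on the vector space V of 2-square matrices over K defined by $T(A) = MA$ where $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.

(i) Find the matrix representation of T relative to, say, the usual basis: $[T] = \begin{pmatrix} 2 & 0 & -1 \\ 1 & 2 & -4 \\ 3 & -3 & 1 \end{pmatrix}$. Then
\[ \det(T) = \begin{vmatrix} 2 & 0 & -1 \\ 1 & 2 & -4 \\ 3 & -3 & 1 \end{vmatrix} = 2(2 - 12) - 1(-3 - 6) = -11 \]

(ii) Find a matrix representation of T in some basis of V, say,
\[ E_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\quad E_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\quad E_3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},\quad E_4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \]
Then
\[ T(E_1) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix} = aE_1 + 0E_2 + cE_3 + 0E_4 \]
\[ T(E_2) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix} = 0E_1 + aE_2 + 0E_3 + cE_4 \]
\[ T(E_3) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} b & 0 \\ d & 0 \end{pmatrix} = bE_1 + 0E_2 + dE_3 + 0E_4 \]
\[ T(E_4) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & b \\ 0 & d \end{pmatrix} = 0E_1 + bE_2 + 0E_3 + dE_4 \]
Thus [T] is the transpose of the matrix of coefficients,
\[ [T] = \begin{pmatrix} a & 0 & c & 0 \\ 0 & a & 0 & c \\ b & 0 & d & 0 \\ 0 & b & 0 & d \end{pmatrix}^{t} \]
and, since a matrix and its transpose have the same determinant, expanding by the first column gives
\[ \det(T) = a\begin{vmatrix} a & 0 & c \\ 0 & d & 0 \\ b & 0 & d \end{vmatrix} + b\begin{vmatrix} 0 & c & 0 \\ a & 0 & c \\ b & 0 & d \end{vmatrix} = a^2d^2 + b^2c^2 - 2abcd \]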
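8.38. Find the inverse of $A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$.

The inverse of A is of the form (Problem 8.53): $A^{-1} = \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}$.
Set $AA^{-1} = I$, the identity matrix:
\[ AA^{-1} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & x+1 & y+z+1 \\ 0 & 1 & z+1 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I \]
Set corresponding entries equal to each other to obtain the system
\[ x + 1 = 0, \quad y + z + 1 = 0, \quad z + 1 = 0 \]
The solution of the system is $x = -1$, $y = 0$, $z = -1$. Hence $A^{-1} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}$.

$A^{-1}$ could also be found by the formula $A^{-1} = (\operatorname{adj} A)/|A|$.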
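
The inverse found above can be verified by direct multiplication. A minimal Python sketch, assuming A is the upper triangular matrix of ones of this problem; `matmul` is a hypothetical helper written for the check.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
A_inv = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]  # x = -1, y = 0, z = -1
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matmul(A, A_inv) == identity)  # True
print(matmul(A_inv, A) == identity)  # True
```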


8.39. Let D be a 2-linear, alternating function. Show that $D(A, B) = -D(B, A)$. More generally, show that if D is multilinear and alternating, then
\[ D(\ldots, A, \ldots, B, \ldots) = -D(\ldots, B, \ldots, A, \ldots) \]
that is, the sign is changed whenever two components are interchanged.

Since D is alternating, $D(A+B, A+B) = 0$. Furthermore, since D is multilinear,
\[ 0 = D(A+B, A+B) = D(A, A+B) + D(B, A+B) = D(A, A) + D(A, B) + D(B, A) + D(B, B) \]
But $D(A, A) = 0$ and $D(B, B) = 0$. Hence
\[ 0 = D(A, B) + D(B, A) \quad\text{or}\quad D(A, B) = -D(B, A) \]
Similarly,
\[ 0 = D(\ldots, A+B, \ldots, A+B, \ldots) \]
\[ = D(\ldots, A, \ldots, A, \ldots) + D(\ldots, A, \ldots, B, \ldots) + D(\ldots, B, \ldots, A, \ldots) + D(\ldots, B, \ldots, B, \ldots) \]
\[ = D(\ldots, A, \ldots, B, \ldots) + D(\ldots, B, \ldots, A, \ldots) \]
and thus $D(\ldots, A, \ldots, B, \ldots) = -D(\ldots, B, \ldots, A, \ldots)$.


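8.40. Let V be the vector space of 2 by 2 matrices $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ over R. Determine whether or not $D: V \to \mathbf{R}$ is 2-linear (with respect to the rows) if (i) $D(M) = a + d$, (ii) $D(M) = ad$.

(i) No. For example, suppose $A = (1, 1)$ and $B = (3, 3)$. Then
\[ D(A, B) = D\begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix} = 4 \quad\text{and}\quad D(2A, B) = D\begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix} = 5 \neq 2D(A, B) \]

(ii) Yes. Let $A = (a_1, a_2)$, $B = (b_1, b_2)$ and $C = (c_1, c_2)$; then
\[ D(A, C) = D\begin{pmatrix} a_1 & a_2 \\ c_1 & c_2 \end{pmatrix} = a_1c_2 \quad\text{and}\quad D(B, C) = D\begin{pmatrix} b_1 & b_2 \\ c_1 & c_2 \end{pmatrix} = b_1c_2 \]
Hence for any scalars $s, t \in \mathbf{R}$,
\[ D(sA + tB, C) = D\begin{pmatrix} sa_1 + tb_1 & sa_2 + tb_2 \\ c_1 & c_2 \end{pmatrix} = (sa_1 + tb_1)c_2 = s(a_1c_2) + t(b_1c_2) = sD(A, C) + tD(B, C) \]
That is, D is linear with respect to the first row.
Furthermore,
\[ D(C, A) = D\begin{pmatrix} c_1 & c_2 \\ a_1 & a_2 \end{pmatrix} = c_1a_2 \quad\text{and}\quad D(C, B) = D\begin{pmatrix} c_1 & c_2 \\ b_1 & b_2 \end{pmatrix} = c_1b_2 \]
Hence for any scalars $s, t \in \mathbf{R}$,
\[ D(C, sA + tB) = D\begin{pmatrix} c_1 & c_2 \\ sa_1 + tb_1 & sa_2 + tb_2 \end{pmatrix} = c_1(sa_2 + tb_2) = s(c_1a_2) + t(c_1b_2) = sD(C, A) + tD(C, B) \]
That is, D is linear with respect to the second row.

Both linearity conditions imply that D is 2-linear.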
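
A quick numerical probe of the two functions in this problem is sketched below. It tests linearity in the first row only, on sample rows chosen here for illustration, so a passing result is evidence rather than proof; the names `D_sum`, `D_prod` and `linear_in_first_row` are hypothetical.

```python
def D_sum(row1, row2):
    """D(M) = a + d for M with rows (a, b) and (c, d)."""
    return row1[0] + row2[1]


def D_prod(row1, row2):
    """D(M) = a d for M with rows (a, b) and (c, d)."""
    return row1[0] * row2[1]


def linear_in_first_row(D, A, B, C, s, t):
    """Test D(sA + tB, C) == s D(A, C) + t D(B, C) for the given data."""
    combo = tuple(s * a + t * b for a, b in zip(A, B))
    return D(combo, C) == s * D(A, C) + t * D(B, C)


# Counterexample from (i): D(2A, B) = 5 but 2 D(A, B) = 8.
print(D_sum((2, 2), (3, 3)), 2 * D_sum((1, 1), (3, 3)))  # 5 8

# Sample check for (ii).
print(linear_in_first_row(D_prod, (1, 2), (3, 4), (5, 6), 2, -3))  # True
```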




Supplementary Problems 
COMPUTATION OF DETERMINANTS 


8.41. Evaluate the determinant of each matrix: (i) ie 7 5 (Gb) € : 
af 3 =z) 


8.42. Compute the determinant of each matrix: (i) ian. Fad) ate i . 
—4 (p= Al =i! tains 


8.43. For each matrix in the preceding problem, find those values of t for which the determinant is zero. 


8.44. Compute the determinant of each matrix: 


sare i | Ba tl 7} Al im AOnegD 
(et Omron 2eieer oat) 20) Be 0] (iii) S Sep Ses tg) || Th al 
iss, 4 Oh, To el A wl. <2 a} Sa 


8.45. Evaluate the determinant of each matrix: 


cen anes | 3 1 eS =3 BT ODY es 1 
(i) De SHE at RR, boa tr] bee ace em cE Ree cag rg oe de ee al 
0 0. 4-— 4 =6 69) gt—34 6 = (eee 


8.46. For each matrix in the preceding problem, determine those values of t for which the determinant is zero.
Le Demo £3 Pa Ab EBT 
: . a fal! O See ae OQ ih =e 
8.47. Evaluate the determinant of each matrix: (i) By etc eee (ii) rae oe rer 
437207 2 PA 7h ah 
COFACTORS, CLASSICAL ADJOINTS, INVERSES 
i Se 8 
8.48. For the matrix oe ae eps , find the cofactor of: 
A (8 Bee al 
Tb Of 8h = 


(i) the entry 4, (ii) the entry 5, (iii) the entry 7. 


font? <6 
Rage iets Al = |-1 1-1 Find (i) adj A, (ii) A~}. 
0 1 
1 2 
$50... Let A = (13 0 Find (i) adj A, (ii) AW}. 
1 1 


8.51. Find the classical adjoint of each matrix in Problem 8.47. 
8.52. Determine the general 2 by 2 matrix A for which A = adj A. 


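8.53. Suppose A is diagonal and B is triangular; say,
\[ A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & a_n \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} b_1 & c_{12} & \cdots & c_{1n} \\ 0 & b_2 & \cdots & c_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & b_n \end{pmatrix} \]
(i) Show that adj A is diagonal and adj B is triangular.
(ii) Show that B is invertible iff all $b_i \neq 0$; hence A is invertible iff all $a_i \neq 0$.
(iii) Show that the inverses of A and B (if either exists) are of the form
\[ A^{-1} = \begin{pmatrix} a_1^{-1} & 0 & \cdots & 0 \\ 0 & a_2^{-1} & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & a_n^{-1} \end{pmatrix}, \quad B^{-1} = \begin{pmatrix} b_1^{-1} & d_{12} & \cdots & d_{1n} \\ 0 & b_2^{-1} & \cdots & d_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & b_n^{-1} \end{pmatrix} \]
That is, the diagonal elements of $A^{-1}$ and $B^{-1}$ are the inverses of the corresponding diagonal elements of A and B.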


DETERMINANT OF A LINEAR OPERATOR 


8.54. Let T be the linear operator on $\mathbf{R}^3$ defined by
\[ T(x, y, z) = (3x - 2z,\; 5y + 7z,\; x + y + z) \]
Find det(T).

8.55. Let $D: V \to V$ be the differential operator, i.e. $D(v) = dv/dt$. Find det(D) if V is the space generated by (i) $\{1, t, \ldots, t^n\}$, (ii) $\{e^t, e^{2t}, e^{3t}\}$, (iii) $\{\sin t, \cos t\}$.

8.56. Prove Theorem 8.12: Let T and S be linear operators on V. Then:
(i) $\det(S\circ T) = \det(S)\cdot\det(T)$; (ii) T is invertible if and only if $\det(T) \neq 0$.

8.57. Show that: (i) $\det(1_V) = 1$ where $1_V$ is the identity operator; (ii) $\det(T^{-1}) = \det(T)^{-1}$ if T is invertible.


DETERMINANTS AND LINEAR EQUATIONS 


3 5y = 8 = =- 
8.58. Solve by determinants: (i) Pare (ii) epee ; : 
Apis 20a Ate 
PEO IS) Pp Ve PEG SB) BD) ae By 
8.59. Solve by determinants: (i) BD oe VAN —— OS ae! (Ch) ae Se = 2 Sab. 
Bip LH) —oyd nS BYP SS 2 aie 
8.60. Prove Theorem 8.10: The homogeneous system $Ax = 0$ has a nonzero solution if and only if $\Delta = |A| = 0$.
PERMUTATIONS
8.61. Determine the parity of these permutations in $S_5$: (i) $\sigma = 32154$, (ii) $\tau = 13524$, (iii) $\pi = 42531$.
8.62. For the permutations $\sigma$, $\tau$ and $\pi$ in Problem 8.61, find (i) $\tau\circ\sigma$, (ii) $\pi\circ\sigma$, (iii) $\sigma^{-1}$, (iv) $\tau^{-1}$.
8.63. Let $\tau \in S_n$. Show that $\tau\circ\sigma$ runs through $S_n$ as $\sigma$ runs through $S_n$; that is, $S_n = \{\tau\circ\sigma : \sigma \in S_n\}$.
8.64. Let $\sigma \in S_n$ have the property that $\sigma(n) = n$. Let $\sigma^* \in S_{n-1}$ be defined by $\sigma^*(x) = \sigma(x)$. (i) Show that $\operatorname{sgn}\sigma^* = \operatorname{sgn}\sigma$. (ii) Show that as $\sigma$ runs through $S_n$, where $\sigma(n) = n$, $\sigma^*$ runs through $S_{n-1}$; that is, $S_{n-1} = \{\sigma^* : \sigma \in S_n,\ \sigma(n) = n\}$.
MULTILINEARITY 
8.65. Let $V = (K^m)^m$, i.e. V is the space of m-square matrices viewed as m-tuples of row vectors. Let $D: V \to K$.
(i) Show that the following weaker statement is equivalent to D being alternating:
\[ D(A_1, A_2, \ldots, A_m) = 0 \]
whenever $A_i = A_{i+1}$ for some i.
(ii) Suppose D is m-linear and alternating. Show that if $A_1, A_2, \ldots, A_m$ are linearly dependent, then $D(A_1, \ldots, A_m) = 0$.


8.66. Let V be the space of 2 by 2 matrices $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ over R. Determine whether or not $D: V \to \mathbf{R}$ is 2-linear (with respect to the rows) if (i) $D(M) = ac - bd$, (ii) $D(M) = ab - cd$, (iii) $D(M) = 0$, (iv) $D(M) = 1$.

8.67. Let V be the space of n-square matrices over K. Suppose $B \in V$ is invertible and so $\det(B) \neq 0$. Define $D: V \to K$ by $D(A) = \det(AB)/\det(B)$ where $A \in V$. Hence
\[ D(A_1, A_2, \ldots, A_n) = \det(A_1B, A_2B, \ldots, A_nB)/\det(B) \]
where $A_i$ is the ith row of A and so $A_iB$ is the ith row of AB. Show that D is multilinear and alternating, and that $D(I) = 1$. (Thus by Theorem 8.13, $D(A) = \det(A)$ and so $\det(AB) = \det(A)\det(B)$. This method is used by some texts to prove Theorem 8.5, i.e. $|AB| = |A|\,|B|$.)


MISCELLANEOUS PROBLEMS 


8.68. Let A be an n-square matrix. Prove $|kA| = k^n|A|$.


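8.69. Prove:
\[ \begin{vmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{vmatrix} = (-1)^{n(n-1)/2} \prod_{i<j} (x_i - x_j) \]
The above is called the Vandermonde determinant of order n.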


8.70. Consider the block matrix $M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ where A and C are square matrices. Prove $|M| = |A|\,|C|$. More generally, prove that if M is a triangular block matrix with square matrices $A_1, \ldots, A_m$ on the diagonal, then $|M| = |A_1|\,|A_2| \cdots |A_m|$.

8.71. Let A, B, C and D be commuting n-square matrices. Consider the 2n-square block matrix $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Prove that $|M| = |A|\,|D| - |B|\,|C|$.

8.72. Suppose A is orthogonal, that is, $A^tA = I$. Show that $|A| = \pm 1$.

8.73. Consider a permutation $\sigma = j_1 j_2 \ldots j_n$. Let $\{e_i\}$ be the usual basis of $K^n$, and let A be the matrix whose ith row is $e_{j_i}$, i.e. $A = (e_{j_1}, e_{j_2}, \ldots, e_{j_n})$. Show that $|A| = \operatorname{sgn}\sigma$.

8.74. Let A be an n-square matrix. The determinantal rank of A is the order of the largest submatrix of A (obtained by deleting rows and columns of A) whose determinant is not zero. Show that the determinantal rank of A is equal to its rank, i.e. the maximum number of linearly independent rows (or columns).


Answers to Supplementary Problems 
8.41. (i) −18, (ii) −15.
8.42. (i) $t^2 - 3t - 10$, (ii) $t^2 - 2t - 8$.
8.43. (i) t = 5, t = −2; (ii) t = 4, t = −2.
8.44. (i) 21, (ii) −11, (iii) 100, (iv) 0.


8.45. (i) $(t+2)(t-3)(t-4)$, (ii) $(t+2)^2(t-4)$, (iii) $(t+2)^2(t-4)$.
8.46. (i) 3, 4, −2; (ii) 4, −2; (iii) 4, −2.
Gi) 135 (ai) — 55, 


(i) =135; (i) =103)" Gii)y 3k: 


-1-1 1 PME Roa: 
adj A = [<1 1-414, Act = (adf-A)/Al = ese 9 
2-2 0 Be a ee 
fe On=2 SO! 43 
Rd Abie aCe A Sat" eeG 
Series: Sod eB 
= 1602-29) 26" 2 21 14 AT 
bi 230. eRe 1G) aso9 ey ib eee ul 
OND rer Regie ema (i) S29 4 13. 2 
134), 140 28.518 iS 7219, =18 
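8.52. $A = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$

8.54. det(T) = 4.

8.55. (i) 0, (ii) 6, (iii) 1.

8.58. (i) x = 21/26, y = 29/26; (ii) x = −5/13, y = 1/13.

8.59. (i) x = 5, y = 1, z = 1. (ii) Since $\Delta = 0$, the system cannot be solved by determinants.

8.61. $\operatorname{sgn}\sigma = 1$, $\operatorname{sgn}\tau = -1$, $\operatorname{sgn}\pi = -1$.

8.62. (i) $\tau\circ\sigma = 53142$, (ii) $\pi\circ\sigma = 52413$, (iii) $\sigma^{-1} = 32154$, (iv) $\tau^{-1} = 14253$.

8.66. (i) Yes, (ii) No, (iii) Yes, (iv) No.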


Chapter 9 


Eigenvalues and Eigenvectors 


INTRODUCTION 


In this chapter we investigate the theory of a single linear operator T on a vector 
space V of finite dimension. In particular, we find conditions under which T is diago- 
nalizable. As was seen in Chapter 7, this question is closely related to the theory of 
similarity transformations for matrices. 


We shall also associate certain polynomials with an operator T: its characteristic 
polynomial and its minimum polynomial. These polynomials and their roots play a major 
role in the investigation of T. We comment that the particular field K also plays an im- 
portant part in the theory since the existence of roots of a polynomial depends on K. 


POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 


Consider a polynomial f(t) over a field K: $f(t) = a_nt^n + \cdots + a_1t + a_0$. If A is a square matrix over K, then we define
\[ f(A) = a_nA^n + \cdots + a_1A + a_0I \]
where I is the identity matrix. In particular, we say that A is a root or zero of the polynomial f(t) if f(A) = 0.


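Example 9.1: Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$, and let $f(t) = 2t^2 - 3t + 7$, $g(t) = t^2 - 5t - 2$. Then
\[ f(A) = 2\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}^2 - 3\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + 7\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 18 & 14 \\ 21 & 39 \end{pmatrix} \]
and
\[ g(A) = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}^2 - 5\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} - 2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]
Thus A is a zero of g(t).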
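
Evaluating a polynomial at a square matrix can be sketched with Horner's rule. The Python fragment below is an illustration only; `matmul` and `poly_at_matrix` are hypothetical helper names, and the coefficient list is assumed to run from the leading coefficient down to the constant term.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate a_n A^n + ... + a_1 A + a_0 I by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)   # multiply the running value by A
        for i in range(n):
            result[i][i] += c        # then add c times the identity
    return result


A = [[1, 2], [3, 4]]
print(poly_at_matrix([2, -3, 7], A))   # [[18, 14], [21, 39]]
print(poly_at_matrix([1, -5, -2], A))  # [[0, 0], [0, 0]]
```

Horner's rule needs only n matrix products for a polynomial of degree n and never forms the powers of A separately.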
The following theorem applies. 
Theorem 9.1: Let f and g be polynomials over K, and let A be an n-square matrix over K. Then
(i) $(f + g)(A) = f(A) + g(A)$
(ii) $(fg)(A) = f(A)\,g(A)$
and, for any scalar $k \in K$,
(iii) $(kf)(A) = k\,f(A)$

Furthermore, since $f(t)\,g(t) = g(t)\,f(t)$ for any polynomials f(t) and g(t),
\[ f(A)\,g(A) = g(A)\,f(A) \]


That is, any two polynomials in the matrix A commute. 




Now suppose $T: V \to V$ is a linear operator on a vector space V over K. If $f(t) = a_nt^n + \cdots + a_1t + a_0$, then we define f(T) in the same way as we did for matrices:
\[ f(T) = a_nT^n + \cdots + a_1T + a_0I \]
where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0.
We remark that the relations in Theorem 9.1 hold for operators as they do for matrices; 
hence any two polynomials in T commute. 


Furthermore, if A is a matrix representation of 7, then f(A) is the matrix representation 
of f(T). In particular, f(T) =0 if and only if f(A) = 0. 


EIGENVALUES AND EIGENVECTORS 


Let $T: V \to V$ be a linear operator on a vector space V over a field K. A scalar $\lambda \in K$ is called an eigenvalue of T if there exists a nonzero vector $v \in V$ for which
\[ T(v) = \lambda v \]
Every vector satisfying this relation is then called an eigenvector of T belonging to the eigenvalue $\lambda$. Note that each scalar multiple kv is such an eigenvector:
\[ T(kv) = k\,T(v) = k(\lambda v) = \lambda(kv) \]
The set of all such vectors is a subspace of V (Problem 9.6) called the eigenspace of $\lambda$.


The terms characteristic value and characteristic vector (or: proper value and proper 
vector) are frequently used instead of eigenvalue and eigenvector. 


Example 9.2: Let $I: V \to V$ be the identity mapping. Then, for every $v \in V$, $I(v) = v = 1v$. Hence 1 is an eigenvalue of I, and every vector in V is an eigenvector belonging to 1.

Example 9.3: Let $T: \mathbf{R}^2 \to \mathbf{R}^2$ be the linear operator which rotates each vector $v \in \mathbf{R}^2$ by an angle $\theta = 90°$. Note that no nonzero vector is mapped into a multiple of itself. Hence T has no eigenvalues and so no eigenvectors.

Example 9.4: Let D be the differential operator on the vector space V of differentiable functions. We have $D(e^{5t}) = 5e^{5t}$. Hence 5 is an eigenvalue of D with eigenvector $e^{5t}$.


If A is an n-square matrix over K, then an eigenvalue of A means an eigenvalue of A viewed as an operator on $K^n$. That is, $\lambda \in K$ is an eigenvalue of A if, for some nonzero (column) vector $v \in K^n$,
\[ Av = \lambda v \]
In this case v is an eigenvector of A belonging to $\lambda$.


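Example 9.5: Find eigenvalues and associated nonzero eigenvectors of the matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$.

We seek a scalar t and a nonzero vector $X = \begin{pmatrix} x \\ y \end{pmatrix}$ such that $AX = tX$:
\[ \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = t\begin{pmatrix} x \\ y \end{pmatrix} \]
The above matrix equation is equivalent to the homogeneous system
\[ \begin{aligned} x + 2y &= tx \\ 3x + 2y &= ty \end{aligned} \quad\text{or}\quad \begin{aligned} (t-1)x - 2y &= 0 \\ -3x + (t-2)y &= 0 \end{aligned} \tag{1} \]
Recall that the homogeneous system has a nonzero solution if and only if the determinant of the matrix of coefficients is 0:
\[ \begin{vmatrix} t-1 & -2 \\ -3 & t-2 \end{vmatrix} = t^2 - 3t - 4 = (t-4)(t+1) = 0 \]
Thus t is an eigenvalue of A if and only if t = 4 or t = −1.

Setting t = 4 in (1),
\[ \begin{aligned} 3x - 2y &= 0 \\ -3x + 2y &= 0 \end{aligned} \quad\text{or simply}\quad 3x - 2y = 0 \]
Thus $v = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$ is a nonzero eigenvector belonging to the eigenvalue t = 4, and every other eigenvector belonging to t = 4 is a multiple of v.

Setting t = −1 in (1),
\[ \begin{aligned} -2x - 2y &= 0 \\ -3x - 3y &= 0 \end{aligned} \quad\text{or simply}\quad x + y = 0 \]
Thus $w = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ is a nonzero eigenvector belonging to the eigenvalue t = −1, and every other eigenvector belonging to t = −1 is a multiple of w.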
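
The eigenpairs of Example 9.5 can be confirmed by substituting back into Av = tv. A minimal Python sketch; `matvec` and `is_eigenpair` are hypothetical helper names introduced for the check.

```python
def matvec(A, v):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def is_eigenpair(A, lam, v):
    """True if v is nonzero and A v = lam v."""
    return any(v) and matvec(A, v) == [lam * x for x in v]


A = [[1, 2], [3, 2]]
print(is_eigenpair(A, 4, [2, 3]))    # True
print(is_eigenpair(A, -1, [1, -1]))  # True
print(is_eigenpair(A, 2, [1, 0]))    # False
```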


The next theorem gives an important characterization of eigenvalues which is fre- 
quently used as its definition. 


Theorem 9.2: Let $T: V \to V$ be a linear operator on a vector space over K. Then $\lambda \in K$ is an eigenvalue of T if and only if the operator $\lambda I - T$ is singular. The eigenspace of $\lambda$ is then the kernel of $\lambda I - T$.

Proof. $\lambda$ is an eigenvalue of T if and only if there exists a nonzero vector v such that
\[ T(v) = \lambda v \quad\text{or}\quad (\lambda I)(v) - T(v) = 0 \quad\text{or}\quad (\lambda I - T)(v) = 0 \]
i.e. $\lambda I - T$ is singular. We also have that v is in the eigenspace of $\lambda$ if and only if the above relations hold; hence v is in the kernel of $\lambda I - T$.


We now state a very useful theorem which we prove (Problem 9.14) by induction: 


Theorem 9.3: Nonzero eigenvectors belonging to distinct eigenvalues are linearly 
independent. 


Example 9.6: Consider the functions $e^{a_1t}, e^{a_2t}, \ldots, e^{a_nt}$ where $a_1, \ldots, a_n$ are distinct real numbers. If D is the differential operator then $D(e^{a_kt}) = a_ke^{a_kt}$. Accordingly, $e^{a_1t}, \ldots, e^{a_nt}$ are eigenvectors of D belonging to the distinct eigenvalues $a_1, \ldots, a_n$, and so, by Theorem 9.3, are linearly independent.

We remark that independent eigenvectors can belong to the same eigenvalue (see Problem 9.7).
DIAGONALIZATION AND EIGENVECTORS 


Let $T: V \to V$ be a linear operator on a vector space V with finite dimension n. Note that T can be represented by a diagonal matrix
\[ \begin{pmatrix} k_1 & 0 & \cdots & 0 \\ 0 & k_2 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & k_n \end{pmatrix} \]
if and only if there exists a basis $\{v_1, \ldots, v_n\}$ of V for which
\[ T(v_1) = k_1v_1, \quad T(v_2) = k_2v_2, \quad \ldots, \quad T(v_n) = k_nv_n \]
that is, such that the vectors $v_1, \ldots, v_n$ are eigenvectors of T belonging respectively to eigenvalues $k_1, \ldots, k_n$. In other words:


Theorem 9.4: A linear operator $T: V \to V$ can be represented by a diagonal matrix B if and only if V has a basis consisting of eigenvectors of T. In this case the diagonal elements of B are the corresponding eigenvalues.


We have the following equivalent statement. 

Alternate Form of Theorem 9.4: An n-square matrix A is similar to a diagonal matrix 
B if and only if A has n linearly independent eigen- 
vectors. In this case the diagonal elements of B are the 
corresponding eigenvalues. 


In the above theorem, if we let P be the matrix whose columns are the n independent 
eigenvectors of A, then B = P-1AP. 


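Example 9.7: Consider the matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$. By Example 9.5, A has two independent eigenvectors $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. Set $P = \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}$, and so $P^{-1} = \begin{pmatrix} 1/5 & 1/5 \\ 3/5 & -2/5 \end{pmatrix}$.
Then A is similar to the diagonal matrix
\[ B = P^{-1}AP = \begin{pmatrix} 1/5 & 1/5 \\ 3/5 & -2/5 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & -1 \end{pmatrix} \]
As expected, the diagonal elements 4 and −1 of the diagonal matrix B are the eigenvalues corresponding to the given eigenvectors.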
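
The diagonalization $B = P^{-1}AP$ for the matrix of Examples 9.5 and 9.7 can be reproduced with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; `matmul` and `inverse_2x2` are hypothetical helpers, and the columns of P are assumed to be the eigenvectors (2, 3) and (1, −1).

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two 2-square matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(P):
    """Inverse of a 2-square matrix via the classical adjoint."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, 2], [3, 2]]
P = [[2, 1], [3, -1]]  # columns are the eigenvectors for t = 4 and t = -1
B = matmul(matmul(inverse_2x2(P), A), P)
print(B == [[4, 0], [0, -1]])  # True
```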


CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM 


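Consider an n-square matrix A over a field K:
\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \]
The matrix $tI_n - A$, where $I_n$ is the n-square identity matrix and t is an indeterminate, is called the characteristic matrix of A:
\[ tI_n - A = \begin{pmatrix} t - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & t - a_{22} & \cdots & -a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ -a_{n1} & -a_{n2} & \cdots & t - a_{nn} \end{pmatrix} \]
Its determinant
\[ \Delta_A(t) = \det(tI_n - A) \]
which is a polynomial in t, is called the characteristic polynomial of A. We also call
\[ \Delta_A(t) = \det(tI_n - A) = 0 \]
the characteristic equation of A.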


Now each term in the determinant contains one and only one entry from each row and from each column; hence the above characteristic polynomial is of the form
\[ \Delta_A(t) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn}) + \text{terms with at most } n - 2 \text{ factors of the form } t - a_{ii} \]
Accordingly,
\[ \Delta_A(t) = t^n - (a_{11} + a_{22} + \cdots + a_{nn})t^{n-1} + \text{terms of lower degree} \]
Recall that the trace of A is the sum of its diagonal elements. Thus the characteristic polynomial $\Delta_A(t) = \det(tI_n - A)$ of A is a monic polynomial of degree n, and the coefficient of $t^{n-1}$ is the negative of the trace of A. (A polynomial is monic if its leading coefficient is 1.)

Furthermore, if we set t = 0 in $\Delta_A(t)$, we obtain
\[ \Delta_A(0) = |-A| = (-1)^n|A| \]
But $\Delta_A(0)$ is the constant term of the polynomial $\Delta_A(t)$. Thus the constant term of the characteristic polynomial of the matrix A is $(-1)^n|A|$ where n is the order of A.


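Example 9.8: The characteristic polynomial of the matrix $A = \begin{pmatrix} 1 & 3 & 0 \\ -2 & 2 & -1 \\ 4 & 0 & -2 \end{pmatrix}$ is
\[ \Delta(t) = |tI - A| = \begin{vmatrix} t-1 & -3 & 0 \\ 2 & t-2 & 1 \\ -4 & 0 & t+2 \end{vmatrix} = t^3 - t^2 + 2t + 28 \]
As expected, $\Delta(t)$ is a monic polynomial of degree 3.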


We now state one of the most important theorems in linear algebra. 


Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic polynomial. 


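Example 9.9: The characteristic polynomial of the matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ is
\[ \Delta(t) = |tI - A| = \begin{vmatrix} t-1 & -2 \\ -3 & t-2 \end{vmatrix} = t^2 - 3t - 4 \]
As expected from the Cayley-Hamilton theorem, A is a zero of $\Delta(t)$:
\[ \Delta(A) = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}^2 - 3\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} - 4\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]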
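
For a 2-square matrix the Cayley-Hamilton theorem can be checked by hand or by machine, since the characteristic polynomial is $t^2 - (\operatorname{tr} A)t + |A|$. The following Python sketch is a check on sample matrices, not a proof; `char_poly_2x2` and `cayley_hamilton_residual` are hypothetical helper names.

```python
def char_poly_2x2(A):
    """Coefficients [1, -trace, det] of det(tI - A) for a 2-square A."""
    (a, b), (c, d) = A
    return [1, -(a + d), a * d - b * c]


def cayley_hamilton_residual(A):
    """Return A^2 - (tr A) A + (det A) I, which should be the zero matrix."""
    (a, b), (c, d) = A
    _, minus_trace, det = char_poly_2x2(A)
    A2 = [[a * a + b * c, a * b + b * d], [c * a + d * c, c * b + d * d]]
    return [[A2[i][j] + minus_trace * A[i][j] + (det if i == j else 0)
             for j in range(2)] for i in range(2)]


A = [[1, 2], [3, 2]]
print(char_poly_2x2(A))             # [1, -3, -4]
print(cayley_hamilton_residual(A))  # [[0, 0], [0, 0]]
```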
The next theorem shows the intimate relationship between characteristic polynomials 


and eigenvalues. 


Theorem 9.6: Let A be an n-square matrix over a field K. A scalar $\lambda \in K$ is an eigenvalue of A if and only if $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of A.

Proof. By Theorem 9.2, $\lambda$ is an eigenvalue of A if and only if $\lambda I - A$ is singular. Furthermore, by Theorem 8.4, $\lambda I - A$ is singular if and only if $|\lambda I - A| = 0$, i.e. $\lambda$ is a root of $\Delta(t)$. Thus the theorem is proved.


Using Theorems 9.3, 9.4 and 9.6, we obtain 


Corollary 9.7: If the characteristic polynomial $\Delta(t)$ of an n-square matrix A is a product of distinct linear factors:
\[ \Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n) \]
i.e. if $a_1, \ldots, a_n$ are distinct roots of $\Delta(t)$, then A is similar to a diagonal matrix whose diagonal elements are the $a_i$.


Furthermore, using the Fundamental Theorem of Algebra (every polynomial over C 
has a root) and the above theorem, we obtain 


Corollary 9.8: Let A be an n-square matrix over the complex field C. Then A has at 
least one eigenvalue. 


Example 9.10: Let $A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & -5 \\ 0 & 1 & -2 \end{pmatrix}$. Its characteristic polynomial is
\[ \Delta(t) = \begin{vmatrix} t-3 & 0 & 0 \\ 0 & t-2 & 5 \\ 0 & -1 & t+2 \end{vmatrix} = (t-3)(t^2+1) \]
We consider two cases:

(i) A is a matrix over the real field R. Then A has only the one eigenvalue 3. Since 3 has only one independent eigenvector, A is not diagonalizable.

(ii) A is a matrix over the complex field C. Then A has three distinct eigenvalues: 3, i and −i. Thus there exists an invertible matrix P over the complex field C for which
\[ P^{-1}AP = \begin{pmatrix} 3 & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -i \end{pmatrix} \]
i.e. A is diagonalizable.
Now suppose A and B are similar matrices, say $B = P^{-1}AP$ where P is invertible. We show that A and B have the same characteristic polynomial. Using $tI = P^{-1}(tI)P$,
\[ |tI - B| = |tI - P^{-1}AP| = |P^{-1}(tI)P - P^{-1}AP| = |P^{-1}(tI - A)P| = |P^{-1}|\,|tI - A|\,|P| \]
Since determinants are scalars and commute, and since $|P^{-1}|\,|P| = 1$, we finally obtain
\[ |tI - B| = |tI - A| \]
Thus we have proved


Theorem 9.9: Similar matrices have the same characteristic polynomial. 
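
Theorem 9.9 can be illustrated on a small case. In the Python sketch below the matrix P is an arbitrary choice with determinant 1 (an assumption made so that its inverse has integer entries); `matmul` and `char_poly_2x2` are hypothetical helpers.

```python
def matmul(X, Y):
    """Product of two 2-square matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of det(tI - M) for a 2-square M."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)


A = [[1, 2], [3, 2]]
P = [[1, 1], [0, 1]]       # illustrative invertible matrix, det P = 1
P_inv = [[1, -1], [0, 1]]
B = matmul(matmul(P_inv, A), P)

print(B)                                       # [[-2, -2], [3, 5]]
print(char_poly_2x2(A) == char_poly_2x2(B))    # True
```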


MINIMUM POLYNOMIAL 


Let A be an n-square matrix over a field K. Observe that there are nonzero polynomials 
f(t) for which f(A) =0; for example, the characteristic polynomial of A. Among these 
polynomials we consider those of lowest degree and from them we select one whose leading 
coefficient is 1, i.e. which is monic. Such a polynomial m(t) exists and is unique (Problem 
9.25); we call it the minimum polynomial of A. 


Theorem 9.10: The minimum polynomial m(t) of A divides every polynomial which has A as a zero. In particular, m(t) divides the characteristic polynomial $\Delta(t)$ of A.

There is an even stronger relationship between m(t) and $\Delta(t)$.

Theorem 9.11: The characteristic and minimum polynomials of a matrix A have the same irreducible factors.

This theorem does not say that $m(t) = \Delta(t)$; only that any irreducible factor of one must divide the other. In particular, since a linear factor is irreducible, m(t) and $\Delta(t)$ have the same linear factors; hence they have the same roots. Thus from Theorem 9.6 we obtain

Theorem 9.12: A scalar $\lambda$ is an eigenvalue for a matrix A if and only if $\lambda$ is a root of the minimum polynomial of A.


Example 9.11: Find the minimum polynomial m(t) of the matrix $A = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 5 \end{pmatrix}$.

The characteristic polynomial of A is $\Delta(t) = |tI - A| = (t-2)^3(t-5)$. By Theorem 9.11, both t − 2 and t − 5 must be factors of m(t). But by Theorem 9.10, m(t) must divide $\Delta(t)$; hence m(t) must be one of the following three polynomials:
\[ m_1(t) = (t-2)(t-5), \quad m_2(t) = (t-2)^2(t-5), \quad m_3(t) = (t-2)^3(t-5) \]
We know from the Cayley-Hamilton theorem that $m_3(A) = \Delta(A) = 0$. The reader can verify that $m_1(A) \neq 0$ but $m_2(A) = 0$. Accordingly, $m_2(t) = (t-2)^2(t-5)$ is the minimum polynomial of A.
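
The verification left to the reader can be sketched in Python. The matrix below is an assumption made for the illustration: a 4-square matrix with characteristic polynomial $(t-2)^3(t-5)$ and a single off-diagonal 1; the helpers `matmul`, `shift` and `is_zero` are hypothetical names.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(A, c):
    """Return A - cI."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]


def is_zero(X):
    return all(entry == 0 for row in X for entry in row)


# Assumed illustrative matrix with characteristic polynomial (t-2)^3 (t-5).
A = [[2, 1, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 5]]

m1 = matmul(shift(A, 2), shift(A, 5))  # (A - 2I)(A - 5I)
m2 = matmul(shift(A, 2), m1)           # (A - 2I)^2 (A - 5I)
m3 = matmul(shift(A, 2), m2)           # (A - 2I)^3 (A - 5I)
print(is_zero(m1), is_zero(m2), is_zero(m3))  # False True True
```

Trying the candidate divisors of the characteristic polynomial in order of increasing degree, as here, stops at the first one that annihilates A.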


Example 9.12: Let A be a 3 by 3 matrix over the real field R. We show that A cannot be a zero of the polynomial $f(t) = t^2 + 1$. By the Cayley-Hamilton theorem, A is a zero of its characteristic polynomial $\Delta(t)$. Note that $\Delta(t)$ is of degree 3; hence it has at least one real root.

Now suppose A is a zero of f(t). Since f(t) is irreducible over R, f(t) must be the minimal polynomial of A. But f(t) has no real root. This contradicts the fact that the characteristic and minimal polynomials have the same roots. Thus A is not a zero of f(t).

The reader can verify that the following 3 by 3 matrix over the complex field C is a zero of f(t):
\[ \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & i \end{pmatrix} \]


CHARACTERISTIC AND MINIMUM POLYNOMIALS OF 
LINEAR OPERATORS 

Now suppose $T: V \to V$ is a linear operator on a vector space V with finite dimension. We define the characteristic polynomial $\Delta(t)$ of T to be the characteristic polynomial of any matrix representation of T. By Theorem 9.9, $\Delta(t)$ is independent of the particular basis in which the matrix representation is computed. Note that the degree of $\Delta(t)$ is equal to the dimension of V. We have theorems for T which are similar to the ones we had for matrices:

Theorem 9.5′: T is a zero of its characteristic polynomial.

Theorem 9.6′: The scalar $\lambda \in K$ is an eigenvalue of T if and only if $\lambda$ is a root of the characteristic polynomial of T.

The algebraic multiplicity of an eigenvalue $\lambda \in K$ of T is defined to be the multiplicity of $\lambda$ as a root of the characteristic polynomial of T. The geometric multiplicity of the eigenvalue $\lambda$ is defined to be the dimension of its eigenspace.

Theorem 9.13: The geometric multiplicity of an eigenvalue $\lambda$ does not exceed its algebraic multiplicity.




Example 9.13: Let V be the vector space of functions which has $\{\sin\theta, \cos\theta\}$ as a basis, and let D be the differential operator on V. Then
\[ D(\sin\theta) = \cos\theta = 0(\sin\theta) + 1(\cos\theta) \]
\[ D(\cos\theta) = -\sin\theta = -1(\sin\theta) + 0(\cos\theta) \]
The matrix A of D in the above basis is therefore $A = [D] = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Thus
\[ \det(tI - A) = \begin{vmatrix} t & 1 \\ -1 & t \end{vmatrix} = t^2 + 1 \]
and the characteristic polynomial of D is $\Delta(t) = t^2 + 1$.


On the other hand, the minimum polynomial m(t) of the operator T is defined independently of the theory of matrices, as the polynomial of lowest degree and leading coefficient 1 which has T as a zero. However, for any polynomial f(t),
\[ f(T) = 0 \quad\text{if and only if}\quad f(A) = 0 \]
where A is any matrix representation of T. Accordingly, T and A have the same minimum polynomial. We remark that all the theorems in this chapter on the minimum polynomial of a matrix also hold for the minimum polynomial of the operator T.


Solved Problems 
POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 


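9.1. Find f(A) where $A = \begin{pmatrix} 1 & -2 \\ 4 & 5 \end{pmatrix}$ and $f(t) = t^2 - 3t + 7$.
\[ f(A) = A^2 - 3A + 7I = \begin{pmatrix} 1 & -2 \\ 4 & 5 \end{pmatrix}^2 - 3\begin{pmatrix} 1 & -2 \\ 4 & 5 \end{pmatrix} + 7\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -3 & -6 \\ 12 & 9 \end{pmatrix} \]

9.2. Show that $A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}$ is a zero of $f(t) = t^2 - 4t - 5$.
\[ f(A) = A^2 - 4A - 5I = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}^2 - 4\begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} - 5\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]

9.3. Let V be the vector space of functions which has $\{\sin\theta, \cos\theta\}$ as a basis, and let D be the differential operator on V. Show that D is a zero of $f(t) = t^2 + 1$.

Apply f(D) to each basis vector:
\[ f(D)(\sin\theta) = (D^2 + I)(\sin\theta) = D^2(\sin\theta) + I(\sin\theta) = -\sin\theta + \sin\theta = 0 \]
\[ f(D)(\cos\theta) = (D^2 + I)(\cos\theta) = D^2(\cos\theta) + I(\cos\theta) = -\cos\theta + \cos\theta = 0 \]
Since each basis vector is mapped into 0, every vector $v \in V$ is also mapped into 0 by f(D). Thus f(D) = 0.

This result is expected since, by Example 9.13, f(t) is the characteristic polynomial of D.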


9.4. Let A be a matrix representation of an operator T. Show that f(A) is the matrix representation of f(T), for any polynomial f(t).

Let $\phi$ be the mapping $T \mapsto A$, i.e. which sends the operator T into its matrix representation A. We need to prove that $\phi(f(T)) = f(A)$. Suppose $f(t) = a_nt^n + \cdots + a_1t + a_0$. The proof is by induction on n, the degree of f(t).

Suppose n = 0. Recall that $\phi(I') = I$ where $I'$ is the identity mapping and I is the identity matrix. Thus
\[ \phi(f(T)) = \phi(a_0I') = a_0\phi(I') = a_0I = f(A) \]
and so the theorem holds for n = 0.

Now assume the theorem holds for polynomials of degree less than n. Then since $\phi$ is an algebra isomorphism,
\[ \phi(f(T)) = \phi(a_nT^n + a_{n-1}T^{n-1} + \cdots + a_1T + a_0I') \]
\[ = a_n\phi(T)\,\phi(T^{n-1}) + \phi(a_{n-1}T^{n-1} + \cdots + a_1T + a_0I') \]
\[ = a_nAA^{n-1} + (a_{n-1}A^{n-1} + \cdots + a_1A + a_0I) = f(A) \]
and the theorem is proved.


Prove Theorem 9.1: Let f and g be polynomials over K. Let A be a square matrix 
over K. Then: (i) (f + g)(A) = f(A) + 9(A); (ii) (fg)(A) = f(A) g(A); and (iii) (kf)(A) = 
kf(A) where KEK. 
Suppose f=—a,i%+---+a,t+a) and g =—}b,t™+---+6,t+ 5 9. Then by definition, 
fA) = a,Ar + °-- + aA + aol and g(A) = 6,A™ + ++: +6,A + bol 


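(i) Suppose $m \le n$ and let $b_i = 0$ if $i > m$. Then

\[ f + g = (a_n + b_n)t^n + \cdots + (a_1 + b_1)t + (a_0 + b_0) \]

Hence

\[ \begin{aligned}
(f + g)(A) &= (a_n + b_n)A^n + \cdots + (a_1 + b_1)A + (a_0 + b_0)I \\
&= a_n A^n + b_n A^n + \cdots + a_1 A + b_1 A + a_0 I + b_0 I = f(A) + g(A)
\end{aligned} \]


(ii) By definition, $fg = c_{n+m} t^{n+m} + \cdots + c_1 t + c_0 = \sum_{k=0}^{n+m} c_k t^k$ where
$c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 = \sum_{i=0}^{k} a_i b_{k-i}$. Hence $(fg)(A) = \sum_{k=0}^{n+m} c_k A^k$ and

\[ f(A)\,g(A) = \Big( \sum_{i=0}^{n} a_i A^i \Big) \Big( \sum_{j=0}^{m} b_j A^j \Big) = \sum_{i=0}^{n} \sum_{j=0}^{m} a_i b_j A^{i+j} = \sum_{k=0}^{n+m} c_k A^k = (fg)(A) \]


(iii) By definition, $kf = k a_n t^n + \cdots + k a_1 t + k a_0$, and so
\[ (kf)(A) = k a_n A^n + \cdots + k a_1 A + k a_0 I = k(a_n A^n + \cdots + a_1 A + a_0 I) = k f(A) \]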
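
Part (ii) of Theorem 9.1 can be checked on a small example. The sketch below is illustrative only: the matrix and the two polynomials are hypothetical choices, and the helper names are not from the text. It forms the coefficients $c_k = \sum a_i b_{k-i}$ of $fg$ and compares $(fg)(A)$ with $f(A)\,g(A)$.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_mul(f, g):
    """Coefficients of fg, using c_k = sum of a_i * b_(k-i)."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c

def poly_at_matrix(coeffs, A):
    """Evaluate a_0 I + a_1 A + ... ; coeffs are listed from a_0 upward."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    total = [[0] * n for _ in range(n)]
    for a in coeffs:
        total = [[total[i][j] + a * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = mat_mul(power, A)
    return total

A = [[1, 2], [3, 4]]   # hypothetical matrix
f = [7, -3, 1]         # f(t) = t^2 - 3t + 7
g = [-5, 2]            # g(t) = 2t - 5
lhs = poly_at_matrix(poly_mul(f, g), A)                     # (fg)(A)
rhs = mat_mul(poly_at_matrix(f, A), poly_at_matrix(g, A))   # f(A) g(A)
print(lhs == rhs)
```

The comparison works because powers of A commute with one another, which is exactly what the double sum in the proof uses.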


EIGENVALUES AND EIGENVECTORS 


9.6. Let $\lambda$ be an eigenvalue of an operator $T: V \to V$. Let $V_\lambda$ denote the set of all
eigenvectors of T belonging to the eigenvalue $\lambda$ (called the eigenspace of $\lambda$). Show that
$V_\lambda$ is a subspace of V.


Suppose $v, w \in V_\lambda$; that is, $T(v) = \lambda v$ and $T(w) = \lambda w$. Then for any scalars $a, b \in K$,
\[ T(av + bw) = aT(v) + bT(w) = a(\lambda v) + b(\lambda w) = \lambda(av + bw) \]


Thus $av + bw$ is an eigenvector belonging to $\lambda$, i.e. $av + bw \in V_\lambda$. Hence $V_\lambda$ is a subspace of V.




9.7. Let $A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}$. (i) Find all eigenvalues of A and the corresponding eigenvectors.
(ii) Find an invertible matrix P such that $P^{-1}AP$ is diagonal.


(i) Form the characteristic matrix $tI - A$ of A:
\[ tI - A = \begin{pmatrix} t & 0 \\ 0 & t \end{pmatrix} - \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} t-1 & -4 \\ -2 & t-3 \end{pmatrix} \qquad (1) \]


The characteristic polynomial $\Delta(t)$ of A is its determinant:

\[ \Delta(t) = |tI - A| = \begin{vmatrix} t-1 & -4 \\ -2 & t-3 \end{vmatrix} = t^2 - 4t - 5 = (t-5)(t+1) \]


The roots of $\Delta(t)$ are 5 and $-1$, and so these numbers are the eigenvalues of A.


We obtain the eigenvectors belonging to the eigenvalue 5. First substitute $t = 5$ into
the characteristic matrix (1) to obtain the matrix $\begin{pmatrix} 4 & -4 \\ -2 & 2 \end{pmatrix}$. The eigenvectors belonging to
5 form the solution of the homogeneous system determined by the above matrix, i.e.,

\[ \begin{pmatrix} 4 & -4 \\ -2 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} 4x - 4y = 0 \\ -2x + 2y = 0 \end{cases} \quad \text{or} \quad x - y = 0 \]

(In other words, the eigenvectors belonging to 5 form the kernel of the operator $tI - A$ for
$t = 5$.) The above system has only one independent solution; for example, $x = 1$, $y = 1$. Thus
$v = (1, 1)$ is an eigenvector which generates the eigenspace of 5, i.e. every eigenvector belonging
to 5 is a multiple of v.


We obtain the eigenvectors belonging to the eigenvalue $-1$. Substitute $t = -1$ into (1)
to obtain the homogeneous system

\[ \begin{pmatrix} -2 & -4 \\ -2 & -4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} -2x - 4y = 0 \\ -2x - 4y = 0 \end{cases} \quad \text{or} \quad x + 2y = 0 \]

The system has only one independent solution; for example, $x = 2$, $y = -1$. Thus $w = (2, -1)$
is an eigenvector which generates the eigenspace of $-1$.


(ii) Let P be the matrix whose columns are the above eigenvectors: $P = \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}$. Then
$B = P^{-1}AP$ is the diagonal matrix whose diagonal entries are the respective eigenvalues:

\[ B = P^{-1}AP = \begin{pmatrix} 1/3 & 2/3 \\ 1/3 & -1/3 \end{pmatrix} \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & -1 \end{pmatrix} \]

(Remark. Here P is the transition matrix from the usual basis of $\mathbf{R}^2$ to the basis of
eigenvectors $\{v, w\}$. Hence B is the matrix representation of the operator A in this new basis.)
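
The diagonalization in Problem 9.7 can be confirmed with exact rational arithmetic. The following is a minimal sketch, assuming A has rows (1, 4), (2, 3) and P has the eigenvectors v = (1, 1) and w = (2, -1) as its columns; the helper names are hypothetical.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(P):
    """Inverse of a 2x2 matrix by the adjoint formula (requires det != 0)."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 4], [2, 3]]
P = [[1, 2], [1, -1]]      # columns are v = (1, 1) and w = (2, -1)
B = mat_mul(mat_mul(inverse_2x2(P), A), P)
print(B)
```

The result should be the diagonal matrix with entries 5 and -1, listed in the same order as the eigenvector columns of P.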


9.8. For each matrix, find all eigenvalues and a basis of each eigenspace:

\[ \text{(i)}\ A = \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}, \qquad \text{(ii)}\ B = \begin{pmatrix} -3 & 1 & -1 \\ -7 & 5 & -1 \\ -6 & 6 & -2 \end{pmatrix} \]

Which matrix can be diagonalized, and why?


(i) Form the characteristic matrix $tI - A$ and compute its determinant to obtain the
characteristic polynomial $\Delta(t)$ of A:

\[ \Delta(t) = |tI - A| = \begin{vmatrix} t-1 & 3 & -3 \\ -3 & t+5 & -3 \\ -6 & 6 & t-4 \end{vmatrix} = (t+2)^2(t-4) \]

The roots of $\Delta(t)$ are $-2$ and 4; hence these numbers are the eigenvalues of A.


We find a basis of the eigenspace of the eigenvalue $-2$. Substitute $t = -2$ into the
characteristic matrix $tI - A$ to obtain the homogeneous system

\[ \begin{pmatrix} -3 & 3 & -3 \\ -3 & 3 & -3 \\ -6 & 6 & -6 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} -3x + 3y - 3z = 0 \\ -3x + 3y - 3z = 0 \\ -6x + 6y - 6z = 0 \end{cases} \quad \text{or} \quad x - y + z = 0 \]

The system has two independent solutions, e.g. $x = 1, y = 1, z = 0$ and $x = 1, y = 0, z = -1$.
Thus $u = (1, 1, 0)$ and $v = (1, 0, -1)$ are independent eigenvectors which generate the
eigenspace of $-2$. That is, u and v form a basis of the eigenspace of $-2$. This means that every
eigenvector belonging to $-2$ is a linear combination of u and v.


We find a basis of the eigenspace of the eigenvalue 4. Substitute $t = 4$ into the
characteristic matrix $tI - A$ to obtain the homogeneous system

\[ \begin{pmatrix} 3 & 3 & -3 \\ -3 & 9 & -3 \\ -6 & 6 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} 3x + 3y - 3z = 0 \\ -3x + 9y - 3z = 0 \\ -6x + 6y = 0 \end{cases} \quad \text{or} \quad \begin{cases} x + y - z = 0 \\ 2y - z = 0 \end{cases} \]

The system has only one free variable; hence any particular nonzero solution, e.g. $x = 1$, $y = 1$,
$z = 2$, generates its solution space. Thus $w = (1, 1, 2)$ is an eigenvector which generates, and
so forms a basis of, the eigenspace of 4.


Since A has three linearly independent eigenvectors, A is diagonalizable. In fact, let P
be the matrix whose columns are the three independent eigenvectors:

\[ P = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & -1 & 2 \end{pmatrix}. \qquad \text{Then} \qquad P^{-1}AP = \begin{pmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 4 \end{pmatrix} \]

As expected, the diagonal elements of $P^{-1}AP$ are the eigenvalues of A corresponding to the
columns of P.


(ii) \[ \Delta(t) = |tI - B| = \begin{vmatrix} t+3 & -1 & 1 \\ 7 & t-5 & 1 \\ 6 & -6 & t+2 \end{vmatrix} = (t+2)^2(t-4) \]

The eigenvalues of B are therefore $-2$ and 4.


We find a basis of the eigenspace of the eigenvalue $-2$. Substitute $t = -2$ into $tI - B$
to obtain the homogeneous system

\[ \begin{pmatrix} 1 & -1 & 1 \\ 7 & -7 & 1 \\ 6 & -6 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} x - y + z = 0 \\ 7x - 7y + z = 0 \\ 6x - 6y = 0 \end{cases} \quad \text{or} \quad \begin{cases} x - y = 0 \\ z = 0 \end{cases} \]

The system has only one independent solution, e.g. $x = 1$, $y = 1$, $z = 0$. Thus $u = (1, 1, 0)$
forms a basis of the eigenspace of $-2$.

We find a basis of the eigenspace of the eigenvalue 4. Substitute $t = 4$ into $tI - B$ to
obtain the homogeneous system

\[ \begin{pmatrix} 7 & -1 & 1 \\ 7 & -1 & 1 \\ 6 & -6 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} 7x - y + z = 0 \\ 7x - y + z = 0 \\ 6x - 6y + 6z = 0 \end{cases} \quad \text{or} \quad \begin{cases} 7x - y + z = 0 \\ x = 0 \end{cases} \]

The system has only one independent solution, e.g. $x = 0$, $y = 1$, $z = 1$. Thus $v = (0, 1, 1)$
forms a basis of the eigenspace of 4.

Observe that B is not similar to a diagonal matrix since B has only two independent
eigenvectors. Furthermore, since A can be diagonalized but B cannot, A and B are not similar
matrices, even though they have the same characteristic polynomial.
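
The count of independent eigenvectors in Problem 9.8 is the dimension of the solution space of $(tI - M)x = 0$, that is, n minus the rank of the characteristic matrix. The sketch below computes it by Gaussian elimination over the rationals. It assumes A and B have the rows (1, -3, 3), (3, -5, 3), (6, -6, 4) and (-3, 1, -1), (-7, 5, -1), (-6, 6, -2), respectively (the printed entries are hard to read, so these are an assumption); the function names are hypothetical.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def eigenspace_dim(M, eigenvalue):
    """Dimension of the solution space of (tI - M)x = 0 at t = eigenvalue."""
    n = len(M)
    char_matrix = [[(eigenvalue if i == j else 0) - M[i][j] for j in range(n)]
                   for i in range(n)]
    return n - rank(char_matrix)

A = [[1, -3, 3], [3, -5, 3], [6, -6, 4]]
B = [[-3, 1, -1], [-7, 5, -1], [-6, 6, -2]]
dims_A = [eigenspace_dim(A, t) for t in (-2, 4)]
dims_B = [eigenspace_dim(B, t) for t in (-2, 4)]
print(dims_A, dims_B)
```

Under these assumptions the dimensions for A sum to 3, so A is diagonalizable, while those for B sum to only 2.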


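9.9. Let $A = \begin{pmatrix} 3 & -1 \\ 1 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & -1 \\ 2 & -1 \end{pmatrix}$. Find all eigenvalues and the corresponding
eigenvectors of A and B viewed as matrices over (i) the real field $\mathbf{R}$, (ii) the complex
field $\mathbf{C}$.


(i) \[ \Delta_A(t) = |tI - A| = \begin{vmatrix} t-3 & 1 \\ -1 & t-1 \end{vmatrix} = t^2 - 4t + 4 = (t-2)^2 \]

Hence only 2 is an eigenvalue. Put $t = 2$ into $tI - A$ and obtain the homogeneous system

\[ \begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} -x + y = 0 \\ -x + y = 0 \end{cases} \quad \text{or} \quad x - y = 0 \]

The system has only one independent solution, e.g. $x = 1$, $y = 1$. Thus $v = (1, 1)$ is an
eigenvector which generates the eigenspace of 2, i.e. every eigenvector belonging to 2 is a multiple
of v.


We also have

\[ \Delta_B(t) = |tI - B| = \begin{vmatrix} t-1 & 1 \\ -2 & t+1 \end{vmatrix} = t^2 + 1 \]

Since $t^2 + 1$ has no solution in $\mathbf{R}$, B has no eigenvalue as a matrix over $\mathbf{R}$.


(ii) Since $\Delta_A(t) = (t-2)^2$ has only the real root 2, the results are the same as in (i). That is,
2 is an eigenvalue of A, and $v = (1, 1)$ is an eigenvector which generates the eigenspace of 2,
i.e. every eigenvector of 2 is a (complex) multiple of v.
The characteristic polynomial of B is $\Delta_B(t) = |tI - B| = t^2 + 1$. Hence $i$ and $-i$ are the
eigenvalues of B.
We find the eigenvectors associated with $t = i$. Substitute $t = i$ in $tI - B$ to obtain the
homogeneous system

\[ \begin{pmatrix} i-1 & 1 \\ -2 & i+1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} (i-1)x + y = 0 \\ -2x + (i+1)y = 0 \end{cases} \quad \text{or} \quad (i-1)x + y = 0 \]

The system has only one independent solution, e.g. $x = 1$, $y = 1 - i$. Thus $w = (1, 1-i)$ is
an eigenvector which generates the eigenspace of $i$.


Now substitute $t = -i$ into $tI - B$ to obtain the homogeneous system

\[ \begin{pmatrix} -i-1 & 1 \\ -2 & -i+1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} (-i-1)x + y = 0 \\ -2x + (-i+1)y = 0 \end{cases} \quad \text{or} \quad (-i-1)x + y = 0 \]

The system has only one independent solution, e.g. $x = 1$, $y = 1 + i$. Thus $w' = (1, 1+i)$ is
an eigenvector which generates the eigenspace of $-i$.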
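
Whether a matrix has eigenvalues depends on the field, as Problem 9.9 shows. For a 2 by 2 matrix the characteristic polynomial is $t^2 - (\operatorname{tr} M)t + \det M$, so its complex roots follow from the quadratic formula. The sketch below is illustrative only; it assumes B has rows (1, -1) and (2, -1), and the function names are hypothetical.

```python
import cmath

def eigenvalues_2x2(M):
    """Complex roots of t^2 - (trace)t + det for a 2x2 matrix M."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)   # complex square root
    return (trace + root) / 2, (trace - root) / 2

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix and a pair v."""
    return tuple(sum(M[i][k] * v[k] for k in range(2)) for i in range(2))

B = [[1, -1], [2, -1]]
lam1, lam2 = eigenvalues_2x2(B)   # discriminant is -4, so no real roots
w = (1, 1 - 1j)                   # candidate eigenvector for t = i
Bw = apply(B, w)
print(lam1, lam2, Bw)
```

Here `Bw` should equal `i * w` componentwise, which is the eigenvector condition over the complex field.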


9.10. Find all eigenvalues and a basis of each eigenspace of the operator $T: \mathbf{R}^3 \to \mathbf{R}^3$ defined
by $T(x, y, z) = (2x + y,\ y - z,\ 2y + 4z)$.


First find a matrix representation of T, say relative to the usual basis of $\mathbf{R}^3$:

\[ A = [T] = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & -1 \\ 0 & 2 & 4 \end{pmatrix} \]

The characteristic polynomial $\Delta(t)$ of T is then

\[ \Delta(t) = |tI - A| = \begin{vmatrix} t-2 & -1 & 0 \\ 0 & t-1 & 1 \\ 0 & -2 & t-4 \end{vmatrix} = (t-2)^2(t-3) \]

Thus 2 and 3 are the eigenvalues of T.


We find a basis of the eigenspace of the eigenvalue 2. Substitute $t = 2$ into $tI - A$ to obtain
the homogeneous system

\[ \begin{pmatrix} 0 & -1 & 0 \\ 0 & 1 & 1 \\ 0 & -2 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} -y = 0 \\ y + z = 0 \\ -2y - 2z = 0 \end{cases} \quad \text{or} \quad \begin{cases} y = 0 \\ z = 0 \end{cases} \]

The system has only one independent solution, e.g. $x = 1$, $y = 0$, $z = 0$. Thus $u = (1, 0, 0)$ forms
a basis of the eigenspace of 2.


We find a basis of the eigenspace of the eigenvalue 3. Substitute $t = 3$ into $tI - A$ to obtain
the homogeneous system

\[ \begin{pmatrix} 1 & -1 & 0 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} x - y = 0 \\ 2y + z = 0 \\ -2y - z = 0 \end{cases} \quad \text{or} \quad \begin{cases} x - y = 0 \\ 2y + z = 0 \end{cases} \]

The system has only one independent solution, e.g. $x = 1$, $y = 1$, $z = -2$. Thus $v = (1, 1, -2)$
forms a basis of the eigenspace of 3.


Observe that T is not diagonalizable, since T has only two linearly independent eigenvectors.


9.11. Show that 0 is an eigenvalue of T if and only if T is singular.


We have that 0 is an eigenvalue of T if and only if there exists a nonzero vector v such that
$T(v) = 0v = 0$, i.e. that T is singular.


9.12. Let A and B be n-square matrices. Show that AB and BA have the same eigenvalues.


By Problem 9.11 and the fact that the product of nonsingular matrices is nonsingular, the
following statements are equivalent: (i) 0 is an eigenvalue of AB, (ii) AB is singular, (iii) A or B is
singular, (iv) BA is singular, (v) 0 is an eigenvalue of BA.


Now suppose $\lambda$ is a nonzero eigenvalue of AB. Then there exists a nonzero vector v such that
$ABv = \lambda v$. Set $w = Bv$. Since $\lambda \ne 0$ and $v \ne 0$,

\[ Aw = ABv = \lambda v \ne 0 \quad \text{and so} \quad w \ne 0 \]

But w is an eigenvector of BA belonging to the eigenvalue $\lambda$ since
\[ BAw = BABv = B\lambda v = \lambda Bv = \lambda w \]

Hence $\lambda$ is an eigenvalue of BA. Similarly, any nonzero eigenvalue of BA is also an eigenvalue
of AB.


Thus AB and BA have the same eigenvalues.


9.13. Suppose $\lambda$ is an eigenvalue of an invertible operator T. Show that $\lambda^{-1}$ is an eigenvalue
of $T^{-1}$.

Since T is invertible, it is also nonsingular; hence by Problem 9.11, $\lambda \ne 0$.

By definition of an eigenvalue, there exists a nonzero vector v for which $T(v) = \lambda v$. Applying
$T^{-1}$ to both sides, we obtain $v = T^{-1}(\lambda v) = \lambda T^{-1}(v)$. Hence $T^{-1}(v) = \lambda^{-1}v$; that is, $\lambda^{-1}$
is an eigenvalue of $T^{-1}$.


9.14. Prove Theorem 9.3: Let $v_1, \ldots, v_n$ be nonzero eigenvectors of an operator $T: V \to V$
belonging to distinct eigenvalues $\lambda_1, \ldots, \lambda_n$. Then $v_1, \ldots, v_n$ are linearly independent.


The proof is by induction on n. If $n = 1$, then $v_1$ is linearly independent since $v_1 \ne 0$.
Assume $n > 1$. Suppose
\[ a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0 \qquad (1) \]
where the $a_i$ are scalars. Applying T to the above relation, we obtain by linearity
\[ a_1 T(v_1) + a_2 T(v_2) + \cdots + a_n T(v_n) = T(0) = 0 \]

But by hypothesis $T(v_i) = \lambda_i v_i$; hence
\[ a_1 \lambda_1 v_1 + a_2 \lambda_2 v_2 + \cdots + a_n \lambda_n v_n = 0 \qquad (2) \]

On the other hand, multiplying (1) by $\lambda_n$,
\[ a_1 \lambda_n v_1 + a_2 \lambda_n v_2 + \cdots + a_n \lambda_n v_n = 0 \qquad (3) \]
Now subtracting (3) from (2),
\[ a_1(\lambda_1 - \lambda_n)v_1 + a_2(\lambda_2 - \lambda_n)v_2 + \cdots + a_{n-1}(\lambda_{n-1} - \lambda_n)v_{n-1} = 0 \]
By induction, each of the above coefficients is 0. Since the $\lambda_i$ are distinct, $\lambda_i - \lambda_n \ne 0$ for $i \ne n$.
Hence $a_1 = \cdots = a_{n-1} = 0$. Substituting this into (1) we get $a_n v_n = 0$, and hence $a_n = 0$. Thus
the $v_i$ are linearly independent.
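
The statement of Problem 9.12, that AB and BA have the same eigenvalues, can be illustrated on a small case. For 2 by 2 matrices the characteristic polynomial is fixed by the trace and the determinant, so the sketch below compares those for AB and BA. The two matrices are hypothetical choices (one of them singular), and the helper names are not from the text.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of t^2 - (trace)t + det."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)

A = [[1, 2], [0, 0]]   # hypothetical singular matrix
B = [[3, 1], [4, 2]]   # hypothetical nonsingular matrix
AB, BA = mat_mul(A, B), mat_mul(B, A)
print(AB, BA)
print(char_poly_2x2(AB), char_poly_2x2(BA))
```

The products AB and BA differ as matrices, yet both have characteristic polynomial $t^2 - 11t$, hence the same eigenvalues 0 and 11.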


CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM 


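9.15. Consider a triangular matrix

\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix} \]

Find its characteristic polynomial $\Delta(t)$ and its eigenvalues.


Since A is triangular and $tI$ is diagonal, $tI - A$ is also triangular with diagonal elements $t - a_{ii}$:

\[ tI - A = \begin{pmatrix} t - a_{11} & -a_{12} & \cdots & -a_{1n} \\ 0 & t - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & t - a_{nn} \end{pmatrix} \]

Then $\Delta(t) = |tI - A|$ is the product of the diagonal elements $t - a_{ii}$:
\[ \Delta(t) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn}) \]

Hence the eigenvalues of A are $a_{11}, a_{22}, \ldots, a_{nn}$, i.e. its diagonal elements.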


9.16. Let $A = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{pmatrix}$. Is A similar to a diagonal matrix? If so, find one such matrix.


Since A is triangular, the eigenvalues of A are the diagonal elements 1, 2 and 3. Since they
are distinct, A is similar to a diagonal matrix whose diagonal elements are 1, 2 and 3; for example,

\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \]


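9.17. For each matrix find a polynomial having the matrix as a root:

\[ \text{(i)}\ A = \begin{pmatrix} 2 & 5 \\ 1 & -3 \end{pmatrix}, \qquad \text{(ii)}\ B = \begin{pmatrix} 2 & -3 \\ 7 & -4 \end{pmatrix}, \qquad \text{(iii)}\ C = \begin{pmatrix} 1 & 4 & -3 \\ 0 & 3 & 1 \\ 0 & 2 & -1 \end{pmatrix} \]

By the Cayley-Hamilton theorem every matrix is a root of its characteristic polynomial.
Therefore we find the characteristic polynomial $\Delta(t)$ in each case.

\[ \text{(i)}\quad \Delta(t) = |tI - A| = \begin{vmatrix} t-2 & -5 \\ -1 & t+3 \end{vmatrix} = t^2 + t - 11 \]

\[ \text{(ii)}\quad \Delta(t) = |tI - B| = \begin{vmatrix} t-2 & 3 \\ -7 & t+4 \end{vmatrix} = t^2 + 2t + 13 \]

\[ \text{(iii)}\quad \Delta(t) = |tI - C| = \begin{vmatrix} t-1 & -4 & 3 \\ 0 & t-3 & -1 \\ 0 & -2 & t+1 \end{vmatrix} = (t-1)(t^2 - 2t - 5) \]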


9.18. Prove the Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic
polynomial.


Let A be an arbitrary n-square matrix and let $\Delta(t)$ be its characteristic polynomial; say,
\[ \Delta(t) = |tI - A| = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0 \]

Now let $B(t)$ denote the classical adjoint of the matrix $tI - A$. The elements of $B(t)$ are cofactors
of the matrix $tI - A$ and hence are polynomials in t of degree not exceeding $n - 1$. Thus

\[ B(t) = B_{n-1}t^{n-1} + \cdots + B_1 t + B_0 \]

where the $B_i$ are n-square matrices over K which are independent of t. By the fundamental property
of the classical adjoint (Theorem 8.8),

\[ (tI - A)B(t) = |tI - A|\,I \]
or
\[ (tI - A)(B_{n-1}t^{n-1} + \cdots + B_1 t + B_0) = (t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0)I \]

Removing parentheses and equating the coefficients of corresponding powers of t,

\[ \begin{aligned}
B_{n-1} &= I \\
B_{n-2} - AB_{n-1} &= a_{n-1}I \\
B_{n-3} - AB_{n-2} &= a_{n-2}I \\
&\ \,\vdots \\
B_0 - AB_1 &= a_1 I \\
-AB_0 &= a_0 I
\end{aligned} \]

Multiplying the above matrix equations by $A^n, A^{n-1}, \ldots, A, I$ respectively,

\[ \begin{aligned}
A^n B_{n-1} &= A^n \\
A^{n-1}B_{n-2} - A^n B_{n-1} &= a_{n-1}A^{n-1} \\
A^{n-2}B_{n-3} - A^{n-1}B_{n-2} &= a_{n-2}A^{n-2} \\
&\ \,\vdots \\
AB_0 - A^2 B_1 &= a_1 A \\
-AB_0 &= a_0 I
\end{aligned} \]

Adding the above matrix equations,

\[ 0 = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I \]

In other words, $\Delta(A) = 0$. That is, A is a zero of its characteristic polynomial.


9.19. Show that a matrix A and its transpose $A^t$ have the same characteristic polynomial.

By the transpose operation, $(tI - A)^t = tI^t - A^t = tI - A^t$. Since a matrix and its transpose
have the same determinant, $|tI - A| = |(tI - A)^t| = |tI - A^t|$. Hence A and $A^t$ have the same
characteristic polynomial.


9.20. Suppose $M = \begin{pmatrix} A_1 & B \\ 0 & A_2 \end{pmatrix}$ where $A_1$ and $A_2$ are square matrices. Show that the
characteristic polynomial of M is the product of the characteristic polynomials of $A_1$ and
$A_2$. Generalize.

We have $tI - M = \begin{pmatrix} tI - A_1 & -B \\ 0 & tI - A_2 \end{pmatrix}$. Hence by Problem 8.70,
\[ |tI - M| = \begin{vmatrix} tI - A_1 & -B \\ 0 & tI - A_2 \end{vmatrix} = |tI - A_1|\,|tI - A_2| \]
as required.

By induction, the characteristic polynomial of the triangular block matrix

\[ M = \begin{pmatrix} A_1 & B & \cdots & C \\ 0 & A_2 & \cdots & D \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_n \end{pmatrix} \]

where the $A_i$ are square matrices, is the product of the characteristic polynomials of the $A_i$.
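
The Cayley-Hamilton theorem of Problem 9.18 can be checked on a concrete matrix. The sketch below is only an illustration under stated assumptions: it uses the matrix of Problem 9.10, taken to have rows (2, 1, 0), (0, 1, -1), (0, 2, 4) and characteristic polynomial $(t-2)^2(t-3)$, and evaluates that polynomial at A in factored form. The helper names are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(M, c):
    """Return M - cI."""
    n = len(M)
    return [[M[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]

A = [[2, 1, 0], [0, 1, -1], [0, 2, 4]]   # assumed matrix of Problem 9.10
A2, A3 = shift(A, 2), shift(A, 3)        # A - 2I and A - 3I

delta_of_A = mat_mul(mat_mul(A2, A2), A3)   # (A - 2I)^2 (A - 3I)
partial = mat_mul(A2, A3)                   # (A - 2I)(A - 3I)
print(delta_of_A)
print(partial)
```

The full product should be the zero matrix, as the theorem asserts. The shorter product is not zero, so for this matrix no proper factor of the characteristic polynomial annihilates A.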


MINIMUM POLYNOMIAL 


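9.21. Find the minimum polynomial $m(t)$ of $A = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & -2 & 4 \end{pmatrix}$.

The characteristic polynomial of A is

\[ \Delta(t) = |tI - A| = \begin{vmatrix} t-2 & -1 & 0 & 0 \\ 0 & t-2 & 0 & 0 \\ 0 & 0 & t-1 & -1 \\ 0 & 0 & 2 & t-4 \end{vmatrix} = \begin{vmatrix} t-2 & -1 \\ 0 & t-2 \end{vmatrix} \begin{vmatrix} t-1 & -1 \\ 2 & t-4 \end{vmatrix} = (t-3)(t-2)^3 \]

The minimum polynomial $m(t)$ must divide $\Delta(t)$. Also, each irreducible factor of $\Delta(t)$, i.e. $t - 2$
and $t - 3$, must be a factor of $m(t)$. Thus $m(t)$ is exactly one of the following:

\[ f(t) = (t-3)(t-2), \qquad g(t) = (t-3)(t-2)^2, \qquad h(t) = (t-3)(t-2)^3 \]

We have

\[ f(A) = (A - 3I)(A - 2I) = \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -2 & 1 \\ 0 & 0 & -2 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & -2 & 2 \end{pmatrix} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \ne 0 \]

\[ g(A) = (A - 3I)(A - 2I)^2 = \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -2 & 1 \\ 0 & 0 & -2 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & -2 & 2 \end{pmatrix} = 0 \]

Thus $g(t) = (t-3)(t-2)^2$ is the minimum polynomial of A.


Remark. We know that $h(A) = \Delta(A) = 0$ by the Cayley-Hamilton theorem. However, the degree
of $g(t)$ is less than the degree of $h(t)$; hence $g(t)$, and not $h(t)$, is the minimum polynomial of A.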
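
The search among the candidates f, g, h in Problem 9.21 amounts to multiplying out factors $A - rI$ and testing for the zero matrix. A minimal sketch follows, assuming A has rows (2, 1, 0, 0), (0, 2, 0, 0), (0, 0, 1, 1), (0, 0, -2, 4); the function `product_of_factors` is a hypothetical helper.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product_of_factors(A, roots):
    """Multiply out (A - r_1 I)(A - r_2 I)... for the listed roots."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for r in roots:
        factor = [[A[i][j] - (r if i == j else 0) for j in range(n)]
                  for i in range(n)]
        result = mat_mul(result, factor)
    return result

A = [[2, 1, 0, 0], [0, 2, 0, 0], [0, 0, 1, 1], [0, 0, -2, 4]]
f_A = product_of_factors(A, [3, 2])       # f(t) = (t - 3)(t - 2)
g_A = product_of_factors(A, [3, 2, 2])    # g(t) = (t - 3)(t - 2)^2
print(f_A)
print(g_A)
```

Since `f_A` has a nonzero entry and `g_A` is the zero matrix, the lowest-degree candidate that annihilates A is g.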


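9.22. Find the minimal polynomial $m(t)$ of each matrix (where $a \ne 0$):

\[ \text{(i)}\ A = \begin{pmatrix} \lambda & a \\ 0 & \lambda \end{pmatrix}, \qquad \text{(ii)}\ B = \begin{pmatrix} \lambda & a & 0 \\ 0 & \lambda & a \\ 0 & 0 & \lambda \end{pmatrix}, \qquad \text{(iii)}\ C = \begin{pmatrix} \lambda & a & 0 & 0 \\ 0 & \lambda & a & 0 \\ 0 & 0 & \lambda & a \\ 0 & 0 & 0 & \lambda \end{pmatrix} \]

(i) The characteristic polynomial of A is $\Delta(t) = (t - \lambda)^2$. We find $A - \lambda I \ne 0$; hence $m(t) =
\Delta(t) = (t - \lambda)^2$.


(ii) The characteristic polynomial of B is $\Delta(t) = (t - \lambda)^3$. (Note $m(t)$ is exactly one of $t - \lambda$, $(t - \lambda)^2$
or $(t - \lambda)^3$.) We find $(B - \lambda I)^2 \ne 0$; thus $m(t) = \Delta(t) = (t - \lambda)^3$.


(iii) The characteristic polynomial of C is $\Delta(t) = (t - \lambda)^4$. We find $(C - \lambda I)^3 \ne 0$; hence $m(t) =
\Delta(t) = (t - \lambda)^4$.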


9.23. Let $M = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$ where A and B are square matrices. Show that the minimum
polynomial $m(t)$ of M is the least common multiple of the minimum polynomials $g(t)$
and $h(t)$ of A and B respectively. Generalize.


Since $m(t)$ is the minimum polynomial of M, $m(M) = \begin{pmatrix} m(A) & 0 \\ 0 & m(B) \end{pmatrix} = 0$ and hence $m(A) = 0$
and $m(B) = 0$. Since $g(t)$ is the minimum polynomial of A, $g(t)$ divides $m(t)$. Similarly, $h(t)$
divides $m(t)$. Thus $m(t)$ is a multiple of $g(t)$ and $h(t)$.


Now let $f(t)$ be another multiple of $g(t)$ and $h(t)$; then $f(M) = \begin{pmatrix} f(A) & 0 \\ 0 & f(B) \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0$.
But $m(t)$ is the minimum polynomial of M; hence $m(t)$ divides $f(t)$. Thus $m(t)$ is the least common
multiple of $g(t)$ and $h(t)$.


We then have, by induction, that the minimum polynomial of

\[ M = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_n \end{pmatrix} \]

where the $A_i$ are square matrices, is the least common multiple of the minimum polynomials of
the $A_i$.


9.24. Find the minimum polynomial $m(t)$ of

\[ M = \begin{pmatrix} 2 & 8 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 5 \end{pmatrix} \]

Let $A = \begin{pmatrix} 2 & 8 \\ 0 & 2 \end{pmatrix}$, $B = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}$, $C = \begin{pmatrix} 0 & 3 \\ 0 & 0 \end{pmatrix}$, $D = (5)$. The minimum polynomials of
A, C and D are $(t - 2)^2$, $t^2$ and $t - 5$ respectively. The characteristic polynomial of B is

\[ |tI - B| = \begin{vmatrix} t-4 & -2 \\ -1 & t-3 \end{vmatrix} = t^2 - 7t + 10 = (t-2)(t-5) \]

and so it is also the minimum polynomial of B.


Observe that $M = \begin{pmatrix} A & 0 & 0 & 0 \\ 0 & B & 0 & 0 \\ 0 & 0 & C & 0 \\ 0 & 0 & 0 & D \end{pmatrix}$. Thus $m(t)$ is the least common multiple of the minimum
polynomials of A, B, C and D. Accordingly, $m(t) = t^2(t - 2)^2(t - 5)$.


9.25. Show that the minimum polynomial of a matrix (operator) A exists and is unique.


By the Cayley-Hamilton theorem, A is a zero of some nonzero polynomial (see also Problem 9.31).
Let n be the lowest degree for which a polynomial $f(t)$ exists such that $f(A) = 0$. Dividing $f(t)$ by
its leading coefficient, we obtain a monic polynomial $m(t)$ of degree n which has A as a zero.
Suppose $m'(t)$ is another monic polynomial of degree n for which $m'(A) = 0$. Then the difference
$m(t) - m'(t)$ is a nonzero polynomial of degree less than n which has A as a zero. This contradicts
the original assumption on n; hence $m(t)$ is a unique minimum polynomial.


9.26. Prove Theorem 9.10: The minimum polynomial $m(t)$ of a matrix (operator) A
divides every polynomial which has A as a zero. In particular, $m(t)$ divides the
characteristic polynomial of A.


Suppose $f(t)$ is a polynomial for which $f(A) = 0$. By the division algorithm there exist
polynomials $q(t)$ and $r(t)$ for which $f(t) = m(t)\,q(t) + r(t)$ and $r(t) = 0$ or $\deg r(t) < \deg m(t)$.
Substituting $t = A$ in this equation, and using that $f(A) = 0$ and $m(A) = 0$, we obtain $r(A) = 0$. If
$r(t) \ne 0$, then $r(t)$ is a polynomial of degree less than $m(t)$ which has A as a zero; this contradicts
the definition of the minimum polynomial. Thus $r(t) = 0$ and so $f(t) = m(t)\,q(t)$, i.e. $m(t)$ divides $f(t)$.


9.27. Let $m(t)$ be the minimum polynomial of an n-square matrix A. Show that the
characteristic polynomial of A divides $(m(t))^n$.

Suppose $m(t) = t^r + c_1 t^{r-1} + \cdots + c_{r-1}t + c_r$. Consider the following matrices:

\[ \begin{aligned}
B_0 &= I \\
B_1 &= A + c_1 I \\
B_2 &= A^2 + c_1 A + c_2 I \\
&\ \,\vdots \\
B_{r-1} &= A^{r-1} + c_1 A^{r-2} + \cdots + c_{r-1} I
\end{aligned} \]

Then

\[ \begin{aligned}
B_0 &= I \\
B_1 - AB_0 &= c_1 I \\
B_2 - AB_1 &= c_2 I \\
&\ \,\vdots \\
B_{r-1} - AB_{r-2} &= c_{r-1} I
\end{aligned} \]

Also,
\[ -AB_{r-1} = c_r I - (A^r + c_1 A^{r-1} + \cdots + c_{r-1}A + c_r I) = c_r I - m(A) = c_r I \]

Set $B(t) = t^{r-1}B_0 + t^{r-2}B_1 + \cdots + tB_{r-2} + B_{r-1}$.
Then

\[ \begin{aligned}
(tI - A) \cdot B(t) &= (t^r B_0 + t^{r-1}B_1 + \cdots + tB_{r-1}) - (t^{r-1}AB_0 + t^{r-2}AB_1 + \cdots + AB_{r-1}) \\
&= t^r B_0 + t^{r-1}(B_1 - AB_0) + t^{r-2}(B_2 - AB_1) + \cdots + t(B_{r-1} - AB_{r-2}) - AB_{r-1} \\
&= t^r I + c_1 t^{r-1} I + c_2 t^{r-2} I + \cdots + c_{r-1} t I + c_r I \\
&= m(t)\,I
\end{aligned} \]

The determinant of both sides gives $|tI - A|\,|B(t)| = |m(t)\,I| = (m(t))^n$. Since $|B(t)|$ is a polynomial,
$|tI - A|$ divides $(m(t))^n$; that is, the characteristic polynomial of A divides $(m(t))^n$.


9.28. Prove Theorem 9.11: The characteristic polynomial $\Delta(t)$ and the minimum
polynomial $m(t)$ of a matrix A have the same irreducible factors.

Suppose $f(t)$ is an irreducible polynomial. If $f(t)$ divides $m(t)$ then, since $m(t)$ divides $\Delta(t)$, $f(t)$
divides $\Delta(t)$. On the other hand, if $f(t)$ divides $\Delta(t)$ then, by the preceding problem, $f(t)$ divides
$(m(t))^n$. But $f(t)$ is irreducible; hence $f(t)$ also divides $m(t)$. Thus $m(t)$ and $\Delta(t)$ have the same
irreducible factors.


9.29. Let T be a linear operator on a vector space V of finite dimension. Show that T is
invertible if and only if the constant term of the minimal (characteristic) polynomial
of T is not zero.

Suppose the minimal (characteristic) polynomial of T is $f(t) = t^r + a_{r-1}t^{r-1} + \cdots + a_1 t + a_0$.
Each of the following statements is equivalent to the succeeding one by preceding results: (i) T is
invertible; (ii) T is nonsingular; (iii) 0 is not an eigenvalue of T; (iv) 0 is not a root of $f(t)$; (v) the
constant term $a_0$ is not zero. Thus the theorem is proved.


9.30. Suppose $\dim V = n$. Let $T: V \to V$ be an invertible operator. Show that $T^{-1}$ is
equal to a polynomial in T of degree not exceeding n.


Let $m(t)$ be the minimal polynomial of T. Then $m(t) = t^r + a_{r-1}t^{r-1} + \cdots + a_1 t + a_0$, where
$r \le n$. Since T is invertible, $a_0 \ne 0$. We have

\[ m(T) = T^r + a_{r-1}T^{r-1} + \cdots + a_1 T + a_0 I = 0 \]

Hence

\[ -\frac{1}{a_0}(T^{r-1} + a_{r-1}T^{r-2} + \cdots + a_1 I)\,T = I \quad \text{and} \quad T^{-1} = -\frac{1}{a_0}(T^{r-1} + a_{r-1}T^{r-2} + \cdots + a_1 I) \]
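
The formula of Problem 9.30 gives the inverse directly from the minimal polynomial. As a hedged illustration, take the matrix with rows (1, 4), (2, 3) from Problem 9.7: its eigenvalues 5 and -1 are distinct, so its minimal polynomial is $t^2 - 4t - 5$, giving $a_1 = -4$, $a_0 = -5$ and $A^{-1} = -\frac{1}{a_0}(A + a_1 I)$. The variable names below are hypothetical.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 4], [2, 3]]      # assumed matrix of Problem 9.7
a1, a0 = -4, -5           # m(t) = t^2 + a1*t + a0 = t^2 - 4t - 5

# T^{-1} = -(1/a0) (T + a1 I), the case r = 2 of the formula above
A_inv = [[Fraction(-1, a0) * (A[i][j] + (a1 if i == j else 0))
          for j in range(2)] for i in range(2)]

print(A_inv)
print(mat_mul(A, A_inv))
```

The second printed matrix should be the identity, confirming that the polynomial expression in A is the inverse.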


MISCELLANEOUS PROBLEMS 


9.31. Let T be a linear operator on a vector space V of dimension n. Without using the
Cayley-Hamilton theorem, show that T is a zero of a nonzero polynomial.

Let $N = n^2$. Consider the following $N + 1$ operators on V: $I, T, T^2, \ldots, T^N$. Recall that the
vector space $A(V)$ of operators on V has dimension $N = n^2$. Thus the above $N + 1$ operators are
linearly dependent. Hence there exist scalars $a_0, a_1, \ldots, a_N$, not all zero, for which
$a_N T^N + \cdots + a_1 T + a_0 I = 0$.
Accordingly, T is a zero of the nonzero polynomial $f(t) = a_N t^N + \cdots + a_1 t + a_0$.


9.32. Prove Theorem 9.13: Let $\lambda$ be an eigenvalue of an operator $T: V \to V$. The geometric
multiplicity of $\lambda$ does not exceed its algebraic multiplicity.


Suppose the geometric multiplicity of $\lambda$ is r. Then $V_\lambda$ contains r linearly independent eigenvectors
$v_1, \ldots, v_r$. Extend the set $\{v_i\}$ to a basis of V: $\{v_1, \ldots, v_r, w_1, \ldots, w_s\}$. We have

\[ \begin{aligned}
T(v_1) &= \lambda v_1 \\
T(v_2) &= \lambda v_2 \\
&\ \,\vdots \\
T(v_r) &= \lambda v_r \\
T(w_1) &= a_{11}v_1 + \cdots + a_{1r}v_r + b_{11}w_1 + \cdots + b_{1s}w_s \\
T(w_2) &= a_{21}v_1 + \cdots + a_{2r}v_r + b_{21}w_1 + \cdots + b_{2s}w_s \\
&\ \,\vdots \\
T(w_s) &= a_{s1}v_1 + \cdots + a_{sr}v_r + b_{s1}w_1 + \cdots + b_{ss}w_s
\end{aligned} \]

The matrix of T in the above basis is

\[ M = \begin{pmatrix} \lambda I_r & A \\ 0 & B \end{pmatrix} \]

where $A = (a_{ij})^t$ and $B = (b_{ij})^t$.


By Problem 9.20 the characteristic polynomial of $\lambda I_r$, which is $(t - \lambda)^r$, must divide the
characteristic polynomial of M and hence T. Thus the algebraic multiplicity of $\lambda$ for the operator T is
at least r, as required.


9.33. Show that $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is not diagonalizable.

The characteristic polynomial of A is $\Delta(t) = (t - 1)^2$; hence 1 is the only eigenvalue of A. We
find a basis of the eigenspace of the eigenvalue 1. Substitute $t = 1$ into the matrix $tI - A$ to obtain
the homogeneous system


\[ \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{or} \quad \begin{cases} -y = 0 \\ 0 = 0 \end{cases} \quad \text{or} \quad y = 0 \]

The system has only one independent solution, e.g. $x = 1$, $y = 0$. Hence $u = (1, 0)$ forms a basis
of the eigenspace of 1.


Since A has at most one independent eigenvector, A cannot be diagonalized.


9.34. Let F be an extension of a field K. Let A be an n-square matrix over K. Note that
A may also be viewed as a matrix $\hat{A}$ over F. Clearly $|tI - A| = |tI - \hat{A}|$, that is, A
and $\hat{A}$ have the same characteristic polynomial. Show that A and $\hat{A}$ also have the
same minimum polynomial.


Let $m(t)$ and $m'(t)$ be the minimum polynomials of A and $\hat{A}$ respectively. Now $m'(t)$ divides
every polynomial over F which has $\hat{A}$ as a zero. Since $m(t)$ has $\hat{A}$ as a zero and since $m(t)$ may be
viewed as a polynomial over F, $m'(t)$ divides $m(t)$. We show now that $m(t)$ divides $m'(t)$.


Since $m'(t)$ is a polynomial over F which is an extension of K, we may write

\[ m'(t) = f_1(t)\,b_1 + f_2(t)\,b_2 + \cdots + f_n(t)\,b_n \]
where the $f_i(t)$ are polynomials over K, and $b_1, \ldots, b_n$ belong to F and are linearly independent over K.
We have
\[ m'(A) = f_1(A)\,b_1 + f_2(A)\,b_2 + \cdots + f_n(A)\,b_n = 0 \qquad (1) \]

Let $a_{ij}^{(k)}$ denote the ij-entry of $f_k(A)$. The above matrix equation implies that, for each pair $(i, j)$,
\[ a_{ij}^{(1)} b_1 + a_{ij}^{(2)} b_2 + \cdots + a_{ij}^{(n)} b_n = 0 \]
Since the $b_i$ are linearly independent over K and since the $a_{ij}^{(k)} \in K$, every $a_{ij}^{(k)} = 0$. Then
\[ f_1(A) = 0, \quad f_2(A) = 0, \quad \ldots, \quad f_n(A) = 0 \]

Since the $f_i(t)$ are polynomials over K which have A as a zero and since $m(t)$ is the minimum
polynomial of A as a matrix over K, $m(t)$ divides each of the $f_i(t)$. Accordingly, by (1), $m(t)$ must also
divide $m'(t)$. But monic polynomials which divide each other are necessarily equal. That is,
$m(t) = m'(t)$, as required.


9.35. Let $\{v_1, \ldots, v_n\}$ be a basis of V. Let $T: V \to V$ be an operator for which $T(v_1) = 0$,
$T(v_2) = a_{21}v_1$, $T(v_3) = a_{31}v_1 + a_{32}v_2$, ..., $T(v_n) = a_{n1}v_1 + \cdots + a_{n,n-1}v_{n-1}$. Show that
$T^n = 0$.

It suffices to show that
\[ T^j(v_j) = 0 \qquad (*) \]
for $j = 1, \ldots, n$. For then it follows that
\[ T^n(v_j) = T^{n-j}(T^j(v_j)) = T^{n-j}(0) = 0, \quad \text{for } j = 1, \ldots, n \]
and, since $\{v_1, \ldots, v_n\}$ is a basis, $T^n = 0$.


We prove (*) by induction on j. The case $j = 1$ is true by hypothesis. The inductive step
follows (for $j = 2, \ldots, n$) from

\[ \begin{aligned}
T^j(v_j) &= T^{j-1}(T(v_j)) = T^{j-1}(a_{j1}v_1 + \cdots + a_{j,j-1}v_{j-1}) \\
&= a_{j1}T^{j-1}(v_1) + \cdots + a_{j,j-1}T^{j-1}(v_{j-1}) \\
&= a_{j1}0 + \cdots + a_{j,j-1}0 = 0
\end{aligned} \]

Remark. Observe that the matrix representation of T in the above basis is triangular with
diagonal elements 0:

\[ \begin{pmatrix} 0 & a_{21} & a_{31} & \cdots & a_{n1} \\ 0 & 0 & a_{32} & \cdots & a_{n2} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & a_{n,n-1} \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix} \]
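
The conclusion of Problem 9.35, that a triangular matrix with zero diagonal satisfies $T^n = 0$, is easy to observe numerically. The sketch below uses a hypothetical 4 by 4 matrix with arbitrary entries above the diagonal; the helper names are not from the text.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(M, k):
    """k-th power of M for k >= 0, by repeated multiplication."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, M)
    return result

# Hypothetical strictly upper triangular matrix (n = 4)
N = [[0, 5, -2, 7],
     [0, 0, 3, 1],
     [0, 0, 0, 4],
     [0, 0, 0, 0]]

print(mat_power(N, 3))   # still has a nonzero corner entry
print(mat_power(N, 4))   # the zero matrix
```

Each multiplication pushes the nonzero entries one step further from the diagonal, which is the matrix form of the induction on j in the proof.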




Supplementary Problems 
POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 
9.36. Let $f(t) = 2t^2 - 5t + 6$ and $g(t) = t^3 - 2t^2 + t + 3$. Find $f(A)$, $g(A)$, $f(B)$ and $g(B)$ where


Lf eB ee Rae 
Soe 0 and B Fale ae 
9.37. Let $T: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $T(x, y) = (x + y, 2x)$. Let $f(t) = t^2 - 2t + 3$. Find $f(T)(x, y)$.


9.38. Let V be the vector space of polynomials $v(x) = ax^2 + bx + c$. Let $D: V \to V$ be the differential
operator. Let $f(t) = t^2 + 2t - 5$. Find $f(D)(v(x))$.


9.39. Let $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. Find $A^2$, $A^3$, $A^n$.


9.40. Let $B = \begin{pmatrix} 8 & 12 & 0 \\ 0 & 8 & 12 \\ 0 & 0 & 8 \end{pmatrix}$. Find a real matrix A such that $B = A^3$.


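9.41. Consider a diagonal matrix M and a triangular matrix N:

\[ M = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} a_1 & b & \cdots & c \\ 0 & a_2 & \cdots & d \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix} \]

Show that, for any polynomial $f(t)$, $f(M)$ and $f(N)$ are of the form

\[ f(M) = \begin{pmatrix} f(a_1) & 0 & \cdots & 0 \\ 0 & f(a_2) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & f(a_n) \end{pmatrix} \quad \text{and} \quad f(N) = \begin{pmatrix} f(a_1) & x & \cdots & y \\ 0 & f(a_2) & \cdots & z \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & f(a_n) \end{pmatrix} \]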


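9.42. Consider a block diagonal matrix M and a block triangular matrix N:

\[ M = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_n \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} A_1 & B & \cdots & C \\ 0 & A_2 & \cdots & D \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_n \end{pmatrix} \]

where the $A_i$ are square matrices. Show that, for any polynomial $f(t)$, $f(M)$ and $f(N)$ are of the form

\[ f(M) = \begin{pmatrix} f(A_1) & 0 & \cdots & 0 \\ 0 & f(A_2) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & f(A_n) \end{pmatrix} \quad \text{and} \quad f(N) = \begin{pmatrix} f(A_1) & X & \cdots & Y \\ 0 & f(A_2) & \cdots & Z \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & f(A_n) \end{pmatrix} \]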


9.43. Show that for any square matrix (or operator) A, $(P^{-1}AP)^n = P^{-1}A^nP$ where P is invertible. More
generally, show that $f(P^{-1}AP) = P^{-1}f(A)P$ for any polynomial $f(t)$.


9.44. Let $f(t)$ be any polynomial. Show that: (i) $f(A^t) = (f(A))^t$; (ii) if A is symmetric, i.e. $A^t = A$,
then $f(A)$ is symmetric.


EIGENVALUES AND EIGENVECTORS 
9.45. For each matrix, find all eigenvalues and linearly independent eigenvectors: 


4 2 - 5 1 
(i) A =( ae (ii) B=(, aE (iii) c=(, ) 


Find invertible matrices $P_1$, $P_2$ and $P_3$ such that $P_1^{-1}AP_1$, $P_2^{-1}BP_2$ and $P_3^{-1}CP_3$ are diagonal.


9.46. For each matrix, find all eigenvalues and a basis for each eigenspace:


Sh et ae LaZr s ee Des aMd: sel)
(i) AN he Zr Th eet Ci ens Se 1 2) a CR One ale rad
IE ed a Sie ale e Ofte Ohare


When possible, find invertible matrices $P_1$, $P_2$ and $P_3$ such that $P_1^{-1}AP_1$, $P_2^{-1}BP_2$ and $P_3^{-1}CP_3$ are
diagonal.


2-1 3 =1
9.47. Consider A = ( ) and B = ( ) as matrices over the real field $\mathbf{R}$. Find all
ily a 1g) =
eigenvalues and linearly independent eigenvectors.


9.48. Consider A and B in the preceding problem as matrices over the complex field $\mathbf{C}$. Find all
eigenvalues and linearly independent eigenvectors.


9.49. For each of the following operators $T: \mathbf{R}^2 \to \mathbf{R}^2$, find all eigenvalues and a basis for each
eigenspace: (i) $T(x, y) = (3x + 3y,\ x + 5y)$; (ii) $T(x, y) = (y, x)$; (iii) $T(x, y) = (y, -x)$.


9.50. For each of the following operators $T: \mathbf{R}^3 \to \mathbf{R}^3$, find all eigenvalues and a basis for each
eigenspace: (i) $T(x, y, z) = (x + y + z,\ 2y + z,\ 2y + 3z)$; (i) T@,y,2) = @+y,y +2, 29 —2);
(Ana) OR, NY SS (GO Pa oe OW a Oe Pe OTD


9.51. For each of the following matrices over the complex field $\mathbf{C}$, find all eigenvalues and linearly
independent eigenvectors:


Leet areas: iL =e) a
OV. |g) Gh slg a ioe GDC Se Vaart 192,


9.52. Suppose v is an eigenvector of operators S and T. Show that v is also an eigenvector of the operator
$aS + bT$ where a and b are any scalars.


9.53. Suppose v is an eigenvector of an operator T belonging to the eigenvalue $\lambda$. Show that for $n > 0$,
v is also an eigenvector of $T^n$ belonging to $\lambda^n$.


9.54. Suppose $\lambda$ is an eigenvalue of an operator T. Show that $f(\lambda)$ is an eigenvalue of $f(T)$.


9.55. Show that similar matrices have the same eigenvalues.


9.56. Show that matrices A and $A^t$ have the same eigenvalues. Give an example where A and $A^t$ have
different eigenvectors.


9.57. Let S and T be linear operators such that $ST = TS$. Let $\lambda$ be an eigenvalue of T and let W be
its eigenspace. Show that W is invariant under S, i.e. $S(W) \subset W$.


9.58. Let V be a vector space of finite dimension over the complex field $\mathbf{C}$. Let $W \ne \{0\}$ be a subspace
of V invariant under a linear operator $T: V \to V$. Show that W contains a nonzero eigenvector of T.


9.59. Let A be an n-square matrix over K. Let $v_1, \ldots, v_n \in K^n$ be linearly independent eigenvectors of
A belonging to the eigenvalues $\lambda_1, \ldots, \lambda_n$, respectively. Let P be the matrix whose columns are the
vectors $v_1, \ldots, v_n$. Show that $P^{-1}AP$ is the diagonal matrix whose diagonal elements are the
eigenvalues $\lambda_1, \ldots, \lambda_n$.


CHARACTERISTIC AND MINIMUM POLYNOMIALS 


9.60. For each matrix, find a polynomial for which the matrix is a root:


2 3-2 
3-7 ve 
(i) a=(j ay cy aa ar Gi) C=10. 5 4 




9.61. | Consider the n-square matrix 


bv Ke) 
KS 


Show that $f(t) = (t - \lambda)^n$ is both the characteristic and minimum polynomial of A.


9.62. Find the characteristic and minimum polynomials of each matrix: 


25) O20, 20 ne! oe Ol Oana0) MOS OO: 

Oe OY a) OS 0 O20) re OP ree 

Ae NRO ew 492? lob ea ONO Sle Os). «dei | Os 0 ene Ors 0 

OOF eS ab20 Oe Os. Oey ial Or OE ae 0) 

OPO Sas Onan Onan OnnO Rt OROmmes ORO Ome O SRN 

Le 2 en O mu 

9.63. Let A = {0 2 0] and B=/0 2 2]. Show that A and B have different characteristic 
ORROse 1 OR Oras 7 
polynomials (and so are not similar), but have the same minimum polynomial. Thus nonsimilar 


matrices may have the same minimum polynomial. 


9.64. ; The mapping T:V-—>V defined by T(v) =kv is called the scalar mapping belonging to k€ K. 
Show that T is the scalar mapping belonging to k @ K if and only if the minimal polynomial of 
Tis mt) =t—k. 


9.65. Let $A$ be an $n$-square matrix for which $A^k = 0$ for some $k > n$. Show that $A^n = 0$.

9.66. Show that a matrix $A$ and its transpose $A^t$ have the same minimum polynomial.


9.67. Suppose $f(t)$ is an irreducible monic polynomial for which $f(T) = 0$ where $T$ is a linear operator
$T: V \to V$. Show that $f(t)$ is the minimal polynomial of $T$.


9.68. Consider a block matrix $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Show that $tI - M = \begin{pmatrix} tI - A & -B \\ -C & tI - D \end{pmatrix}$ is the
characteristic matrix of $M$.


9.69. Let $T$ be a linear operator on a vector space $V$ of finite dimension. Let $W$ be a subspace of $V$
invariant under $T$, i.e. $T(W) \subset W$. Let $T_W: W \to W$ be the restriction of $T$ to $W$. (i) Show that
the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$. (ii) Show that the
minimum polynomial of $T_W$ divides the minimum polynomial of $T$.


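9.70. Let $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$. Show that the characteristic polynomial of $A$ is

\[ \Delta(t) = t^3 - (a_{11} + a_{22} + a_{33})\,t^2 + \left( \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} + \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} \right) t - \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \]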


9.71. Let $A$ be an $n$-square matrix. The determinant of the matrix of order $n - m$ obtained by deleting
the rows and columns passing through $m$ diagonal elements of $A$ is called a principal minor of degree
$n - m$. Show that the coefficient of $t^m$ in the characteristic polynomial $\Delta(t) = |tI - A|$ is the sum
of all principal minors of $A$ of degree $n - m$ multiplied by $(-1)^{n-m}$. (Observe that the preceding
problem is a special case of this result.)




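9.72. Consider an arbitrary monic polynomial $f(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0$. The following
$n$-square matrix $A$ is called the companion matrix of $f(t)$:

\[ A = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{pmatrix} \]

Show that $f(t)$ is the minimum polynomial of $A$.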
9.73. Find a matrix $A$ whose minimum polynomial is (i) $t^3 - 5t^2 + 6t + 8$, (ii) $t^4 - 5t^3 - 2t^2 + 7t + 4$.


DIAGONALIZATION 


9.74. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a matrix over the real field $\mathbf{R}$. Find necessary and sufficient conditions on
$a$, $b$, $c$ and $d$ so that $A$ is diagonalizable, i.e. has two linearly independent eigenvectors.

9.75. Repeat the preceding problem for the case that $A$ is a matrix over the complex field $\mathbf{C}$.


9.76. Show that a matrix (operator) is diagonalizable if and only if its minimal polynomial is a product 
of distinct linear factors. 


9.77. Let $A$ and $B$ be $n$-square matrices over $K$ such that (i) $AB = BA$ and (ii) $A$ and $B$ are both
diagonalizable. Show that $A$ and $B$ can be simultaneously diagonalized, i.e. there exists a basis of
$K^n$ in which both $A$ and $B$ are represented by diagonal matrices. (See Problem 9.57.)

9.78. Let $E: V \to V$ be a projection operator, i.e. $E^2 = E$. Show that $E$ is diagonalizable and, in fact,
can be represented by the diagonal matrix $A = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$ where $r$ is the rank of $E$.


Answers to Supplementary Problems 


OMe —40 39 3a. 3 12 


9.37. $f(T)(x, y) = (4x - y,\ -2x + 5y)$.


9.38. $f(D)(v(x)) = -5ax^2 + (4a - 5b)x + (2a + 2b - 5c)$.
S/T 8 plas re ie yp 
me eee 


9.40. Hint. Let $A = \begin{pmatrix} 2 & a & b \\ 0 & 2 & c \\ 0 & 0 & 2 \end{pmatrix}$. Set $B = A^3$ and then obtain conditions on $a$, $b$ and $c$.


9.44. (ii) Using (i), we have $(f(A))^t = f(A^t) = f(A)$.


9:45. (i) ky = 1, ae = (25-1); Xe = 4, =p). 
(ii) Ay =1, wu = (2,—8); A. = 6, v = (1,1). 
(it) Nee a (lle) 
Pal phe al 
Let P,; = es *) anda2o5— ( a P3; does not exist since C has only one independent 
eigenvector, and so cannot be diagonalized. 




9.46. 


9.48. 


9.49. 


9.50. 


9.51. 


9.56. 


9.57. 


9.58. 


9.59. 


9.62. 


9.73. 


9.77. 


(i) Ay = 2, u= (Gl Seal 0), Wo (1, 0, cat) 5 Ww) = 6, w= (ale 2, Ds 
Gi) 44 =38, w= (1,1,0), » = (10,1); 4p = 1,'w = 2, -1,1). 
(iii) A=1, u=(1,0,0), v = (O70; 1). 


a va lee | 1 yeahs epes” 
et ele 2) and) Pea= (1.20? 21. |, P; does not exist since C has at most two 
(St sal OFS tel 


linearly independent eigenvectors, and so cannot be diagonalized. 
(i) A= 38, w=(1,—-1); (ii) B has no eigenvalues (in R). 
3 — (11). (ii) = 24, w= (1,8 — 21); Ap = —2i, v = (1,8 + 2%). 


(i) Ay = 2, u = (8,—1); A. = 6, v = (1,1). (ii) 4y =1, w= (1,1); 4. = —-1, v = (1,—1). (iii) There 
are no eigenvalues (in R). 


Go Ay = 1, =.(1; 0,0); Ay = 40 a = (11,2). 
(ii) 4 =1, w= (1,0,0). There are no other eigenvalues (in R). 
(iii) Ay = 1, wu = (1,0,—1); A. = 2, v = (2, -2,—1); Ag = 8, w = (1, —-2,—1). 


(Gi) 2X =1, w= (1,0); 9 = 4, v = (14%. Gd a= 1, w= (450) Gi) “a, = 2, w= B,D} AY. = —2, 
v = (1,-1). (iv) 4y = 4, w= (2,1-1; 9 = 7, v = (2,149). 


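Let $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. Then $\lambda = 1$ is the only eigenvalue and $v = (1, 0)$ generates the eigenspace
of $\lambda = 1$. On the other hand, for $A^t = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, $\lambda = 1$ is still the only eigenvalue, but $w = (0, 1)$
generates the eigenspace of $\lambda = 1$.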


Let $v \in W$, and so $T(v) = \lambda v$. Then $T(Sv) = S(Tv) = S(\lambda v) = \lambda(Sv)$, that is, $Sv$ is an eigenvector
of $T$ belonging to the eigenvalue $\lambda$. In other words, $Sv \in W$ and thus $S(W) \subset W$.


Let $\hat{T}: W \to W$ be the restriction of $T$ to $W$. The characteristic polynomial of $\hat{T}$ is a polynomial
over the complex field $\mathbf{C}$ which, by the fundamental theorem of algebra, has a root $\lambda$. Then $\lambda$ is an
eigenvalue of $\hat{T}$, and so $\hat{T}$ has a nonzero eigenvector in $W$ which is also an eigenvector of $T$.
Suppose $T(v) = \lambda v$. Then $(kT)(v) = kT(v) = k(\lambda v) = (k\lambda)v$.
(i) f(t) = 2@—8t+ 48, (ii) g(t) = 2 —8t+ 28, (iii) h(t) = @—6# + 5t—12. 


(i) $\Delta(t) = (t-2)^3(t-7)^2$; $m(t) = (t-2)^2(t-7)$. (ii) $\Delta(t) = (t-3)^5$; $m(t) = (t-3)^3$. (iii) $\Delta(t) = (t-\lambda)^5$; $m(t) = t - \lambda$.


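Use the result of Problem 9.72. (i) $A = \begin{pmatrix} 0 & 0 & -8 \\ 1 & 0 & -6 \\ 0 & 1 & 5 \end{pmatrix}$, (ii) $A = \begin{pmatrix} 0 & 0 & 0 & -4 \\ 1 & 0 & 0 & -7 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 5 \end{pmatrix}$.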


Hint. Use the result of Problem 9.57. 


Chapter 10 


Canonical Forms 


INTRODUCTION 


Let $T$ be a linear operator on a vector space of finite dimension. As seen in the preceding
chapter, $T$ may not have a diagonal matrix representation. However, it is still possible
to "simplify" the matrix representation of $T$ in a number of ways. This is the main topic
of this chapter. In particular, we obtain the primary decomposition theorem, and the
triangular, Jordan and rational canonical forms.


We comment that the triangular and Jordan canonical forms exist for $T$ if and only if
the characteristic polynomial $\Delta(t)$ of $T$ has all its roots in the base field $K$. This is always
true if $K$ is the complex field $\mathbf{C}$ but may not be true if $K$ is the real field $\mathbf{R}$.


We also introduce the idea of a quotient space. This is a very powerful tool and will be 
used in the proof of the existence of the triangular and rational canonical forms. 


TRIANGULAR FORM 


Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Suppose $T$ can be represented
by the triangular matrix

\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{pmatrix} \]

Then the characteristic polynomial of $T$,

\[ \Delta(t) = |tI - A| = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn}) \]

is a product of linear factors. The converse is also true and is an important theorem;
namely,


Theorem 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors
into linear polynomials. Then there exists a basis of $V$ in which $T$ is
represented by a triangular matrix.


Alternate Form of Theorem 10.1: Let $A$ be a square matrix whose characteristic polynomial
factors into linear polynomials. Then $A$ is similar to a triangular matrix, i.e. there exists
an invertible matrix $P$ such that $P^{-1}AP$ is triangular.


We say that an operator T can be brought into triangular form if it can be represented 
by a triangular matrix. Note that in this case the eigenvalues of T are precisely those 
entries appearing on the main diagonal. We give an application of this remark. 




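Example 10.1: Let $A$ be a square matrix over the complex field $\mathbf{C}$. Suppose $\lambda$ is an eigenvalue of $A^2$.
Show that $\sqrt{\lambda}$ or $-\sqrt{\lambda}$ is an eigenvalue of $A$. We know by the above theorem that
$A$ is similar to a triangular matrix

\[ B = \begin{pmatrix} \mu_1 & * & \cdots & * \\ & \mu_2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n \end{pmatrix} \]

Hence $A^2$ is similar to the matrix

\[ B^2 = \begin{pmatrix} \mu_1^2 & * & \cdots & * \\ & \mu_2^2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n^2 \end{pmatrix} \]

Since similar matrices have the same eigenvalues, $\lambda = \mu_i^2$ for some $i$. Hence
$\mu_i = \sqrt{\lambda}$ or $\mu_i = -\sqrt{\lambda}$; that is, $\sqrt{\lambda}$ or $-\sqrt{\lambda}$ is an eigenvalue of $A$.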
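
The key step of Example 10.1, that squaring a triangular matrix squares its diagonal entries, can be checked numerically. The following is a minimal sketch in plain Python; the matrix `B` and the helper `mat_mul` are hypothetical illustrations, not taken from the text.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A hypothetical upper triangular matrix with diagonal entries 2, -3, 5.
B = [[2, 7, 1],
     [0, -3, 4],
     [0, 0, 5]]

B2 = mat_mul(B, B)

diag_B = [B[i][i] for i in range(3)]            # eigenvalues of B
diag_B2 = [B2[i][i] for i in range(3)]          # eigenvalues of B^2
below_diag_B2 = [B2[i][j] for i in range(3) for j in range(i)]

print(diag_B, diag_B2)
```

Since `B2` is again upper triangular, its eigenvalues are its diagonal entries, and each is the square of the corresponding diagonal entry of `B`.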


INVARIANCE 


Let $T: V \to V$ be linear. A subspace $W$ of $V$ is said to be invariant under $T$ or
$T$-invariant if $T$ maps $W$ into itself, i.e. if $v \in W$ implies $T(v) \in W$. In this case $T$
restricted to $W$ defines a linear operator on $W$; that is, $T$ induces a linear operator $\hat{T}: W \to W$
defined by $\hat{T}(w) = T(w)$ for every $w \in W$.


Example 10.2: Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear operator which rotates each vector about the $z$ axis
by an angle $\theta$:

\[ T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z) \]


Observe that each vector w = 
(a,b,0) in the xy plane W remains 
in W under the mapping T, i.e. W 
is T-invariant. Observe also that 
the z axis U is invariant under T. 
Furthermore, the restriction of T 
to W rotates each vector about the 
origin O, and the restriction of T 
to U is the identity mapping on U. 
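
The two invariance claims of Example 10.2 can be tried out on sample vectors. This is a small sketch only; the function name `rotate_z`, the angle and the test vectors are assumptions chosen for illustration.

```python
import math

def rotate_z(v, theta):
    """Rotate the vector v = (x, y, z) about the z axis by the angle theta."""
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

theta = 0.7                                  # an arbitrary angle in radians
w = rotate_z((3.0, -2.0, 0.0), theta)        # a vector in the xy plane W
u = rotate_z((0.0, 0.0, 4.0), theta)         # a vector on the z axis U

print(w)   # third coordinate is still 0, so the image stays in W
print(u)   # unchanged, so T restricted to U is the identity
```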




Example 10.3: Nonzero eigenvectors of a linear operator $T: V \to V$ may be characterized as generators
of $T$-invariant 1-dimensional subspaces. For suppose $T(v) = \lambda v$, $v \ne 0$.
Then $W = \{kv,\ k \in K\}$, the 1-dimensional subspace generated by $v$, is invariant
under $T$ because

\[ T(kv) = kT(v) = k(\lambda v) = (k\lambda)v \in W \]

Conversely, suppose $\dim U = 1$ and $u \ne 0$ generates $U$, and $U$ is invariant under
$T$. Then $T(u) \in U$ and so $T(u)$ is a multiple of $u$, i.e. $T(u) = \mu u$. Hence $u$ is an
eigenvector of $T$.


The next theorem gives us an important class of invariant subspaces. 


Theorem 10.2: Let $T: V \to V$ be linear, and let $f(t)$ be any polynomial. Then the kernel
of $f(T)$ is invariant under $T$.
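
Theorem 10.2 can be illustrated on a concrete case. The sketch below assumes a hypothetical $3 \times 3$ matrix `T` and the polynomial $f(t) = t^2 - 4t + 4 = (t-2)^2$; the helper names are illustrative. It picks a vector `v` in the kernel of $f(T)$ and checks that $T(v)$ lies in that kernel as well.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(a, v):
    """Apply the matrix a to the column vector v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def poly_at(coeffs, t_mat):
    """Evaluate a polynomial at a matrix by Horner's rule.
    coeffs lists the coefficients from the highest power down to the constant."""
    n = len(t_mat)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, t_mat)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

T = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]

f_T = poly_at([1, -4, 4], T)     # f(T) = T^2 - 4T + 4I

v = [1, 1, 0]                    # a vector with f(T)(v) = 0
Tv = mat_vec(T, v)

print(mat_vec(f_T, v), mat_vec(f_T, Tv))
```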


The notion of invariance is related to matrix representations as follows.

Theorem 10.3: Suppose $W$ is an invariant subspace of $T: V \to V$. Then $T$ has a block
matrix representation $\begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ where $A$ is a matrix representation of
the restriction of $T$ to $W$.




INVARIANT DIRECT-SUM DECOMPOSITIONS

A vector space $V$ is termed the direct sum of its subspaces $W_1, \ldots, W_r$, written

\[ V = W_1 \oplus W_2 \oplus \cdots \oplus W_r \]

if every vector $v \in V$ can be written uniquely in the form

\[ v = w_1 + w_2 + \cdots + w_r \quad \text{with } w_i \in W_i \]

The following theorem applies.


Theorem 10.4: Suppose $W_1, \ldots, W_r$ are subspaces of $V$, and suppose

\[ \{w_{11}, \ldots, w_{1n_1}\},\ \ldots,\ \{w_{r1}, \ldots, w_{rn_r}\} \]

are bases of $W_1, \ldots, W_r$ respectively. Then $V$ is the direct sum of the
$W_i$ if and only if the union $\{w_{11}, \ldots, w_{1n_1}, \ldots, w_{r1}, \ldots, w_{rn_r}\}$ is a basis
of $V$.


Now suppose $T: V \to V$ is linear and $V$ is the direct sum of (nonzero) $T$-invariant
subspaces $W_1, \ldots, W_r$:

\[ V = W_1 \oplus \cdots \oplus W_r \quad \text{and} \quad T(W_i) \subset W_i, \quad i = 1, \ldots, r \]

Let $T_i$ denote the restriction of $T$ to $W_i$. Then $T$ is said to be decomposable into the operators
$T_i$ or $T$ is said to be the direct sum of the $T_i$, written $T = T_1 \oplus \cdots \oplus T_r$. Also, the subspaces
$W_1, \ldots, W_r$ are said to reduce $T$ or to form a $T$-invariant direct-sum decomposition of $V$.


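Consider the special case where two subspaces $U$ and $W$ reduce an operator $T: V \to V$;
say, $\dim U = 2$ and $\dim W = 3$ and suppose $\{u_1, u_2\}$ and $\{w_1, w_2, w_3\}$ are bases of $U$ and
$W$ respectively. If $T_1$ and $T_2$ denote the restrictions of $T$ to $U$ and $W$ respectively, then

\[ \begin{aligned} T_1(u_1) &= a_{11}u_1 + a_{12}u_2 \\ T_1(u_2) &= a_{21}u_1 + a_{22}u_2 \end{aligned} \qquad \begin{aligned} T_2(w_1) &= b_{11}w_1 + b_{12}w_2 + b_{13}w_3 \\ T_2(w_2) &= b_{21}w_1 + b_{22}w_2 + b_{23}w_3 \\ T_2(w_3) &= b_{31}w_1 + b_{32}w_2 + b_{33}w_3 \end{aligned} \]

Hence

\[ A = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} b_{11} & b_{21} & b_{31} \\ b_{12} & b_{22} & b_{32} \\ b_{13} & b_{23} & b_{33} \end{pmatrix} \]

are matrix representations of $T_1$ and $T_2$ respectively. By the above theorem $\{u_1, u_2, w_1, w_2, w_3\}$
is a basis of $V$. Since $T(u_i) = T_1(u_i)$ and $T(w_j) = T_2(w_j)$, the matrix of $T$ in this basis is
the block diagonal matrix $\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$.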


A generalization of the above argument gives us the following theorem. 


Theorem 10.5: Suppose $T: V \to V$ is linear and $V$ is the direct sum of $T$-invariant subspaces
$W_1, \ldots, W_r$. If $A_i$ is a matrix representation of the restriction of
$T$ to $W_i$, then $T$ can be represented by the block diagonal matrix

\[ M = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{pmatrix} \]

The block diagonal matrix $M$ with diagonal entries $A_1, \ldots, A_r$ is sometimes called the
direct sum of the matrices $A_1, \ldots, A_r$ and denoted by $M = A_1 \oplus \cdots \oplus A_r$.
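
The direct sum of matrices is straightforward to assemble in code. The following sketch uses a hypothetical helper `direct_sum` that places square blocks, given as lists of rows, along the diagonal of a zero matrix.

```python
def direct_sum(*blocks):
    """Return the block diagonal matrix whose diagonal blocks are the given
    square matrices, each a list of rows."""
    n = sum(len(b) for b in blocks)
    m = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        k = len(b)
        for i in range(k):
            for j in range(k):
                m[offset + i][offset + j] = b[i][j]
        offset += k
    return m

A1 = [[1, 2],
      [3, 4]]
A2 = [[5]]

M = direct_sum(A1, A2)
for row in M:
    print(row)
```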




PRIMARY DECOMPOSITION 


The following theorem shows that any operator $T: V \to V$ is decomposable into operators
whose minimal polynomials are powers of irreducible polynomials. This is the first
step in obtaining a canonical form for $T$.


Primary Decomposition Theorem 10.6: Let $T: V \to V$ be a linear operator with minimal
polynomial

\[ m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r} \]

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the
direct sum of $T$-invariant subspaces $W_1, \ldots, W_r$ where $W_i$ is the kernel of
$f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of
$T$ to $W_i$.


Since the polynomials $f_i(t)^{n_i}$ are relatively prime, the above fundamental result follows
(Problem 10.11) from the next two theorems.


Theorem 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)\,h(t)$ are polynomials
such that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the
direct sum of the $T$-invariant subspaces $U$ and $W$, where $U = \operatorname{Ker} g(T)$
and $W = \operatorname{Ker} h(T)$.


Theorem 10.8: In Theorem 10.7, if f(t) is the minimal polynomial of T [and g(t) and h(t) 
are monic], then g(t) and h(t) are the minimal polynomials of the restric- 
tions of T to U and W respectively. 


We will also use the primary decomposition theorem to prove the following useful 
characterization of diagonalizable operators. 


Theorem 10.9: A linear operator $T: V \to V$ has a diagonal matrix representation if and
only if its minimal polynomial $m(t)$ is a product of distinct linear
polynomials.


Alternate Form of Theorem 10.9: A matrix A is similar to a diagonal matrix if and only 
if its minimal polynomial is a product of distinct linear polynomials. 


Example 10.4: Suppose $A \ne I$ is a square matrix for which $A^3 = I$. Determine whether or not
$A$ is similar to a diagonal matrix if $A$ is a matrix over (i) the real field $\mathbf{R}$, (ii) the
complex field $\mathbf{C}$.

Since $A^3 = I$, $A$ is a zero of the polynomial $f(t) = t^3 - 1 = (t - 1)(t^2 + t + 1)$.
The minimal polynomial $m(t)$ of $A$ cannot be $t - 1$, since $A \ne I$. Hence

\[ m(t) = t^2 + t + 1 \quad \text{or} \quad m(t) = t^3 - 1 \]

Since neither polynomial is a product of linear polynomials over $\mathbf{R}$, $A$ is not diagonalizable
over $\mathbf{R}$. On the other hand, each of the polynomials is a product of distinct
linear polynomials over $\mathbf{C}$. Hence $A$ is diagonalizable over $\mathbf{C}$.
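
A concrete instance of Example 10.4 may help. The sketch below assumes the hypothetical matrix $A = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}$ (not from the text) and verifies that $A \ne I$, $A^3 = I$ and $A^2 + A + I = 0$, so that $m(t) = t^2 + t + 1$, which has no real roots because its discriminant is negative.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I = [[1, 0],
     [0, 1]]
A = [[0, -1],
     [1, -1]]          # a hypothetical matrix with A != I and A^3 = I

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# A^2 + A + I, computed entry by entry
m_of_A = [[A2[i][j] + A[i][j] + I[i][j] for j in range(2)] for i in range(2)]

# Discriminant of t^2 + t + 1; a negative value means no real linear factors.
discriminant = 1 * 1 - 4 * 1 * 1

print(A3, m_of_A, discriminant)
```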


NILPOTENT OPERATORS 

A linear operator $T: V \to V$ is termed nilpotent if $T^n = 0$ for some positive integer $n$;
we call $k$ the index of nilpotency of $T$ if $T^k = 0$ but $T^{k-1} \ne 0$. Analogously, a square matrix
$A$ is termed nilpotent if $A^n = 0$ for some positive integer $n$, and of index $k$ if $A^k = 0$ but
$A^{k-1} \ne 0$. Clearly the minimum polynomial of a nilpotent operator (matrix) of index $k$ is
$m(t) = t^k$; hence 0 is its only eigenvalue.
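
The index of nilpotency can be found by taking successive powers. The function below is an illustrative sketch with hypothetical names; it relies on the fact that a nilpotent $n$-square matrix satisfies $A^n = 0$, so powers beyond the $n$th need not be examined.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotency_index(a):
    """Return the least k with a^k = 0, or None if a is not nilpotent."""
    n = len(a)
    power = a
    for k in range(1, n + 1):
        if all(x == 0 for row in power for x in row):
            return k
        power = mat_mul(power, a)
    return None

N = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
P = [[1, 0],
     [0, 0]]           # a projection, which is not nilpotent

print(nilpotency_index(N), nilpotency_index(P))
```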


The fundamental result on nilpotent operators follows. 


Theorem 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a block
diagonal matrix representation whose diagonal entries are of the form

\[ N = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix} \]

(i.e. all entries of $N$ are 0 except those just above the main diagonal where
they are 1). There is at least one $N$ of order $k$ and all other $N$ are of orders
$\le k$. The number of $N$ of each possible order is uniquely determined by
$T$. Moreover, the total number of $N$ of all orders is equal to the nullity
of $T$.

In the proof of the above theorem, we shall show that the number of $N$ of order $i$ is
$2m_i - m_{i+1} - m_{i-1}$, where $m_i$ is the nullity of $T^i$.

We remark that the above matrix $N$ is itself nilpotent and that its index of nilpotency is
equal to its order (Problem 10.13). Note that the matrix $N$ of order 1 is just the $1 \times 1$ zero
matrix (0).
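
The counting formula $2m_i - m_{i+1} - m_{i-1}$ can be tried on a small example. The sketch below is only an illustration: the helper names are hypothetical, the rank is computed by Gaussian elimination over exact fractions, and the test matrix is a $5$-square nilpotent matrix already built from one block of order 3 and one of order 2.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(a):
    """Rank of a matrix by Gaussian elimination with exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r

def block_counts(a):
    """Map each order i to the number of nilpotent blocks N of order i,
    using 2*m_i - m_(i+1) - m_(i-1), where m_i is the nullity of a^i."""
    n = len(a)
    nullity = [0]                      # m_0 = 0
    power = a
    for _ in range(n + 1):             # m_1, ..., m_(n+1)
        nullity.append(n - rank(power))
        power = mat_mul(power, a)
    counts = {}
    for i in range(1, n + 1):
        c = 2 * nullity[i] - nullity[i + 1] - nullity[i - 1]
        if c > 0:
            counts[i] = c
    return counts

T = [[0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]

counts = block_counts(T)
print(counts)
```

The total number of blocks found this way agrees with the nullity of `T`, as the theorem states.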


JORDAN CANONICAL FORM 

An operator T can be put into Jordan canonical form if its characteristic and minimal 
polynomials factor into linear polynomials. This is always true if K is the complex field C. 
In any case, we can always extend the base field K to a field in which the characteristic 
and minimum polynomials do factor into linear factors; thus in a broad sense every operator 
has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan 
canonical form. 


Theorem 10.11: Let $T: V \to V$ be a linear operator whose characteristic and minimum
polynomials are respectively

\[ \Delta(t) = (t - \lambda_1)^{n_1} \cdots (t - \lambda_r)^{n_r} \quad \text{and} \quad m(t) = (t - \lambda_1)^{m_1} \cdots (t - \lambda_r)^{m_r} \]

where the $\lambda_i$ are distinct scalars. Then $T$ has a block diagonal matrix
representation $J$ whose diagonal entries are of the form

\[ J_{ij} = \begin{pmatrix} \lambda_i & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{pmatrix} \]

For each $\lambda_i$ the corresponding blocks $J_{ij}$ have the following properties:

(i) There is at least one $J_{ij}$ of order $m_i$; all other $J_{ij}$ are of order $\le m_i$.

(ii) The sum of the orders of the $J_{ij}$ is $n_i$.

(iii) The number of $J_{ij}$ equals the geometric multiplicity of $\lambda_i$.

(iv) The number of $J_{ij}$ of each possible order is uniquely determined by $T$.


The matrix $J$ appearing in the above theorem is called the Jordan canonical form of the
operator $T$. A diagonal block $J_{ij}$ is called a Jordan block belonging to the eigenvalue $\lambda_i$.
Observe that

\[ \begin{pmatrix} \lambda_i & 1 & \cdots & 0 & 0 \\ 0 & \lambda_i & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & \cdots & 0 & \lambda_i \end{pmatrix} = \begin{pmatrix} \lambda_i & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \lambda_i & 0 \\ 0 & 0 & \cdots & 0 & \lambda_i \end{pmatrix} + \begin{pmatrix} 0 & 1 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} \]

That is,

\[ J_{ij} = \lambda_i I + N \]

where $N$ is the nilpotent block appearing in Theorem 10.10. In fact, we prove the above
theorem (Problem 10.18) by showing that $T$ can be decomposed into operators, each the sum
of a scalar and a nilpotent operator.


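Example 10.5: Suppose the characteristic and minimum polynomials of an operator $T$ are respectively

\[ \Delta(t) = (t-2)^4 (t-3)^3 \quad \text{and} \quad m(t) = (t-2)^2 (t-3)^2 \]

Then the Jordan canonical form of $T$ is one of the following block diagonal matrices:

\[ \operatorname{diag}\left( \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\ \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\ \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix},\ (3) \right) \quad \text{or} \quad \operatorname{diag}\left( \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\ (2),\ (2),\ \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix},\ (3) \right) \]

The first matrix occurs if $T$ has two independent eigenvectors belonging to its eigenvalue 2;
and the second matrix occurs if $T$ has three independent eigenvectors belonging to 2.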
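
The possibilities in Example 10.5 can be enumerated mechanically from properties (i) and (ii) of Theorem 10.11: for each eigenvalue the block orders form a nonincreasing list with largest entry $m_i$ and sum $n_i$. The following sketch uses hypothetical function names and takes a dictionary mapping each eigenvalue to the pair $(n_i, m_i)$.

```python
from itertools import product

def partitions_with_max(n, m):
    """All nonincreasing lists of positive integers with sum n whose
    largest entry is exactly m."""
    def parts(remaining, largest):
        if remaining == 0:
            yield []
            return
        for p in range(min(remaining, largest), 0, -1):
            for rest in parts(remaining - p, p):
                yield [p] + rest
    return [[m] + rest for rest in parts(n - m, m)]

def jordan_block_patterns(spec):
    """spec maps each eigenvalue to (n_i, m_i); return every admissible
    assignment of Jordan block orders to the eigenvalues."""
    eigenvalues = sorted(spec)
    choices = [partitions_with_max(*spec[lam]) for lam in eigenvalues]
    return [dict(zip(eigenvalues, combo)) for combo in product(*choices)]

# Exponents read off from the characteristic and minimum polynomials above.
patterns = jordan_block_patterns({2: (4, 2), 3: (3, 2)})
for p in patterns:
    print(p)
```

The number of blocks listed for an eigenvalue is its geometric multiplicity, which is how the two cases in the example are told apart.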


CYCLIC SUBSPACES 


Let $T$ be a linear operator on a vector space $V$ of finite dimension over $K$. Suppose
$v \in V$ and $v \ne 0$. The set of all vectors of the form $f(T)(v)$, where $f(t)$ ranges over all
polynomials over $K$, is a $T$-invariant subspace of $V$ called the $T$-cyclic subspace of $V$ generated
by $v$; we denote it by $Z(v, T)$ and denote the restriction of $T$ to $Z(v, T)$ by $T_v$. We
could equivalently define $Z(v, T)$ as the intersection of all $T$-invariant subspaces of $V$
containing $v$.


Now consider the sequence

\[ v,\ T(v),\ T^2(v),\ T^3(v),\ \ldots \]

of powers of $T$ acting on $v$. Let $k$ be the lowest integer such that $T^k(v)$ is a linear combination
of those vectors which precede it in the sequence; say,

\[ T^k(v) = -a_{k-1}T^{k-1}(v) - \cdots - a_1 T(v) - a_0 v \]

Then

\[ m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_1 t + a_0 \]

is the unique monic polynomial of lowest degree for which $m_v(T)(v) = 0$. We call $m_v(t)$ the
$T$-annihilator of $v$ and $Z(v, T)$.


The following theorem applies.

Theorem 10.12: Let $Z(v, T)$, $T_v$ and $m_v(t)$ be defined as above. Then:

(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence $\dim Z(v, T) = k$.

(ii) The minimal polynomial of $T_v$ is $m_v(t)$.

(iii) The matrix representation of $T_v$ in the above basis is

\[ C = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\ 0 & 0 & 0 & \cdots & 1 & -a_{k-1} \end{pmatrix} \]

The above matrix $C$ is called the companion matrix of the polynomial $m_v(t)$.
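
To make the companion matrix concrete, the sketch below builds it from the coefficients $a_0, \ldots, a_{k-1}$ of a monic polynomial and checks that the polynomial annihilates it. The helper names and the cubic $t^3 - 5t^2 + 6t + 8$ are illustrative assumptions.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def companion(low_coeffs):
    """Companion matrix of t^k + a_(k-1) t^(k-1) + ... + a_1 t + a_0,
    where low_coeffs = [a_0, a_1, ..., a_(k-1)]."""
    k = len(low_coeffs)
    c = [[0] * k for _ in range(k)]
    for i in range(1, k):
        c[i][i - 1] = 1                  # 1's just below the main diagonal
    for i in range(k):
        c[i][k - 1] = -low_coeffs[i]     # last column holds -a_0, ..., -a_(k-1)
    return c

def monic_poly_at(low_coeffs, mat):
    """Evaluate the monic polynomial with the given lower coefficients at mat,
    using Horner's rule from the leading coefficient 1 downward."""
    n = len(mat)
    result = [[0] * n for _ in range(n)]
    for coeff in [1] + list(reversed(low_coeffs)):
        result = mat_mul(result, mat)
        result = [[result[i][j] + (coeff if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

coeffs = [8, 6, -5]                      # t^3 - 5t^2 + 6t + 8
C = companion(coeffs)
value = monic_poly_at(coeffs, C)

print(C)
print(value)
```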


RATIONAL CANONICAL FORM 


In this section we present the rational canonical form for a linear operator T:V->V. 
We emphasize that this form exists even when the minimal polynomial cannot be factored 
into linear polynomials. (Recall that this is not the case for the Jordan canonical form.) 


Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$ where
$f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum

\[ V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T) \]

of $T$-cyclic subspaces $Z(v_i, T)$ with corresponding $T$-annihilators

\[ f(t)^{n_1},\ f(t)^{n_2},\ \ldots,\ f(t)^{n_r}, \qquad n = n_1 \ge n_2 \ge \cdots \ge n_r \]

Any other decomposition of $V$ into $T$-cyclic subspaces has the same number
of components and the same set of $T$-annihilators.


We emphasize that the above lemma does not say that the vectors $v_i$ or the $T$-cyclic subspaces
$Z(v_i, T)$ are uniquely determined by $T$; but it does say that the set of $T$-annihilators
is uniquely determined by $T$. Thus $T$ has a unique matrix representation

\[ \begin{pmatrix} C_1 & & & \\ & C_2 & & \\ & & \ddots & \\ & & & C_r \end{pmatrix} \]

where the $C_i$ are companion matrices. In fact, the $C_i$ are the companion matrices to the
polynomials $f(t)^{n_i}$.
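Using the primary decomposition theorem and the above lemma, we obtain the following
fundamental result.

Theorem 10.14: Let $T: V \to V$ be a linear operator with minimal polynomial

\[ m(t) = f_1(t)^{m_1} f_2(t)^{m_2} \cdots f_s(t)^{m_s} \]

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $T$ has a
unique block diagonal matrix representation

\[ \begin{pmatrix} C_{11} & & & & & \\ & \ddots & & & & \\ & & C_{1r_1} & & & \\ & & & \ddots & & \\ & & & & C_{s1} & \\ & & & & & \ddots \\ & & & & & & C_{sr_s} \end{pmatrix} \]

where the $C_{ij}$ are companion matrices. In particular, the $C_{ij}$ are the companion
matrices of the polynomials $f_i(t)^{n_{ij}}$ where

\[ m_1 = n_{11} \ge n_{12} \ge \cdots \ge n_{1r_1}, \quad \ldots, \quad m_s = n_{s1} \ge n_{s2} \ge \cdots \ge n_{sr_s} \]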


The above matrix representation of $T$ is called its rational canonical form. The polynomials
$f_i(t)^{n_{ij}}$ are called the elementary divisors of $T$.


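Example 10.6: Let $V$ be a vector space of dimension 6 over $\mathbf{R}$, and let $T$ be a linear operator whose
minimal polynomial is $m(t) = (t^2 - t + 3)(t - 2)^2$. Then the rational canonical form
of $T$ is one of the following direct sums of companion matrices:

(i) $C(t^2 - t + 3) \oplus C(t^2 - t + 3) \oplus C((t-2)^2)$

(ii) $C(t^2 - t + 3) \oplus C((t-2)^2) \oplus C((t-2)^2)$

(iii) $C(t^2 - t + 3) \oplus C((t-2)^2) \oplus C(t-2) \oplus C(t-2)$

where $C(f(t))$ is the companion matrix of $f(t)$; that is,

(i) $\operatorname{diag}\left( \begin{pmatrix} 0 & -3 \\ 1 & 1 \end{pmatrix},\ \begin{pmatrix} 0 & -3 \\ 1 & 1 \end{pmatrix},\ \begin{pmatrix} 0 & -4 \\ 1 & 4 \end{pmatrix} \right)$

(ii) $\operatorname{diag}\left( \begin{pmatrix} 0 & -3 \\ 1 & 1 \end{pmatrix},\ \begin{pmatrix} 0 & -4 \\ 1 & 4 \end{pmatrix},\ \begin{pmatrix} 0 & -4 \\ 1 & 4 \end{pmatrix} \right)$

(iii) $\operatorname{diag}\left( \begin{pmatrix} 0 & -3 \\ 1 & 1 \end{pmatrix},\ \begin{pmatrix} 0 & -4 \\ 1 & 4 \end{pmatrix},\ (2),\ (2) \right)$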


QUOTIENT SPACES 


Let $V$ be a vector space over a field $K$ and let $W$ be a subspace of $V$. If $v$ is any vector
in $V$, we write $v + W$ for the set of sums $v + w$ with $w \in W$:

\[ v + W = \{v + w : w \in W\} \]


These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets 
partition V into mutually disjoint subsets. 


Example 10.7: Let $W$ be the subspace of $\mathbf{R}^2$ defined by

\[ W = \{(a, b) : a = b\} \]

That is, $W$ is the line given by the equation $x - y = 0$. We can view $v + W$ as a
translation of the line, obtained by adding the vector $v$ to each point in $W$. As noted
in the diagram on the right, $v + W$ is also a line and is parallel to $W$. Thus the cosets
of $W$ in $\mathbf{R}^2$ are precisely all the lines parallel to $W$.

[Figure: the line $W$ through the origin in the $xy$ plane and the parallel line $v + W$.]


In the next theorem we use the cosets of a subspace W of a vector space V to define a 
new vector space; it is called the quotient space of V by W and is denoted by V/W. 


Theorem 10.15: Let $W$ be a subspace of a vector space $V$ over a field $K$. Then the cosets of
$W$ in $V$ form a vector space over $K$ with the following operations of addition
and scalar multiplication:

(i) $(u + W) + (v + W) = (u + v) + W$

(ii) $k(u + W) = ku + W$, where $k \in K$.

We note that, in the proof of the above theorem, it is first necessary to show that the
operations are well defined; that is, whenever $u + W = u' + W$ and $v + W = v' + W$, then

(i) $(u + v) + W = (u' + v') + W$ and (ii) $ku + W = ku' + W$, for any $k \in K$
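
The well-definedness condition can be seen at work in the setting of Example 10.7, where $W$ is the line $x - y = 0$ in $\mathbf{R}^2$. In the sketch below each coset $v + W$ is labelled by the representative $(x - y, 0)$, which is one possible choice made here for illustration; the function names and sample vectors are hypothetical.

```python
def coset(v):
    """Label the coset v + W, where W = {(a, a)}, by its representative
    lying on the x axis: (x, y) - (y, y) = (x - y, 0)."""
    x, y = v
    return (x - y, 0)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(k, v):
    return (k * v[0], k * v[1])

u, v = (4, 1), (-2, 5)

# Other representatives of the same two cosets: add elements of W.
u2 = add(u, (3, 3))
v2 = add(v, (-5, -5))

same_cosets = coset(u2) == coset(u) and coset(v2) == coset(v)
sum_well_defined = coset(add(u2, v2)) == coset(add(u, v))
scalar_well_defined = coset(scale(7, u2)) == coset(scale(7, u))

print(same_cosets, sum_well_defined, scalar_well_defined)
```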




In the case of an invariant subspace, we have the following useful result. 


Theorem 10.16: Suppose $W$ is a subspace invariant under a linear operator $T: V \to V$.
Then $T$ induces a linear operator $\bar{T}$ on $V/W$ defined by $\bar{T}(v + W) =
T(v) + W$. Moreover, if $T$ is a zero of any polynomial, then so is $\bar{T}$.
Thus the minimum polynomial of $\bar{T}$ divides the minimum polynomial of $T$.


Solved Problems 


INVARIANT SUBSPACES 


10.1. Suppose $T: V \to V$ is linear. Show that each of the following is invariant under $T$:
(i) $\{0\}$, (ii) $V$, (iii) kernel of $T$, (iv) image of $T$.

(i) We have $T(0) = 0 \in \{0\}$; hence $\{0\}$ is invariant under $T$.

(ii) For every $v \in V$, $T(v) \in V$; hence $V$ is invariant under $T$.

(iii) Let $u \in \operatorname{Ker} T$. Then $T(u) = 0 \in \operatorname{Ker} T$ since the kernel of $T$ is a subspace of $V$. Thus
$\operatorname{Ker} T$ is invariant under $T$.

(iv) Since $T(v) \in \operatorname{Im} T$ for every $v \in V$, it is certainly true if $v \in \operatorname{Im} T$. Hence the image of
$T$ is invariant under $T$.


10.2. Suppose $\{W_i\}$ is a collection of $T$-invariant subspaces of a vector space $V$. Show that
the intersection $W = \bigcap_i W_i$ is also $T$-invariant.

Suppose $v \in W$; then $v \in W_i$ for every $i$. Since $W_i$ is $T$-invariant, $T(v) \in W_i$ for every $i$.
Thus $T(v) \in W = \bigcap_i W_i$ and so $W$ is $T$-invariant.


10.3. Prove Theorem 10.2: Let $T: V \to V$ be any linear operator and let $f(t)$ be any polynomial.
Then the kernel of $f(T)$ is invariant under $T$.

Suppose $v \in \operatorname{Ker} f(T)$, i.e. $f(T)(v) = 0$. We need to show that $T(v)$ also belongs to the kernel
of $f(T)$, i.e. $f(T)(T(v)) = 0$. Since $f(t)\,t = t\,f(t)$, we have $f(T)\,T = T\,f(T)$. Thus

\[ f(T)\,T(v) = T\,f(T)(v) = T(0) = 0 \]

as required.


10.4. Find all invariant subspaces of $A = \begin{pmatrix} 2 & -5 \\ 1 & -2 \end{pmatrix}$ viewed as an operator on $\mathbf{R}^2$.

First of all, we have that $\mathbf{R}^2$ and $\{0\}$ are invariant under $A$. Now if $A$ has any other invariant
subspaces, then it must be 1-dimensional. However, the characteristic polynomial of $A$ is

\[ \Delta(t) = |tI - A| = \begin{vmatrix} t - 2 & 5 \\ -1 & t + 2 \end{vmatrix} = t^2 + 1 \]

Hence $A$ has no eigenvalues (in $\mathbf{R}$) and so $A$ has no eigenvectors. But the 1-dimensional invariant
subspaces correspond to the eigenvectors; thus $\mathbf{R}^2$ and $\{0\}$ are the only subspaces invariant under $A$.


10.5. Prove Theorem 10.3: Suppose $W$ is an invariant subspace of $T: V \to V$. Then $T$
has a block matrix representation $\begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ where $A$ is a matrix representation
of the restriction $\hat{T}$ of $T$ to $W$.

We choose a basis $\{w_1, \ldots, w_r\}$ of $W$ and extend it to a basis $\{w_1, \ldots, w_r, v_1, \ldots, v_s\}$ of $V$.
We have

\[ \begin{aligned} \hat{T}(w_1) = T(w_1) &= a_{11}w_1 + \cdots + a_{1r}w_r \\ \hat{T}(w_2) = T(w_2) &= a_{21}w_1 + \cdots + a_{2r}w_r \\ &\ \ \vdots \\ \hat{T}(w_r) = T(w_r) &= a_{r1}w_1 + \cdots + a_{rr}w_r \\ T(v_1) &= b_{11}w_1 + \cdots + b_{1r}w_r + c_{11}v_1 + \cdots + c_{1s}v_s \\ T(v_2) &= b_{21}w_1 + \cdots + b_{2r}w_r + c_{21}v_1 + \cdots + c_{2s}v_s \\ &\ \ \vdots \\ T(v_s) &= b_{s1}w_1 + \cdots + b_{sr}w_r + c_{s1}v_1 + \cdots + c_{ss}v_s \end{aligned} \]

But the matrix of $T$ in this basis is the transpose of the matrix of coefficients in the above system
of equations. (See page 150.) Therefore it has the form $\begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ where $A$ is the transpose of
the matrix of coefficients for the obvious subsystem. By the same argument, $A$ is the matrix of
$\hat{T}$ relative to the basis $\{w_i\}$ of $W$.


10.6. Let $\hat{T}$ denote the restriction of an operator $T$ to an invariant subspace $W$, i.e.
$\hat{T}(w) = T(w)$ for every $w \in W$. Prove:
(i) For any polynomial $f(t)$, $f(\hat{T})(w) = f(T)(w)$.
(ii) The minimum polynomial of $\hat{T}$ divides the minimum polynomial of $T$.

(i) If $f(t) = 0$ or if $f(t)$ is a constant or of degree 1, then the result clearly holds. Assume
$\deg f = n > 1$ and that the result holds for polynomials of degree less than $n$. Suppose that

\[ f(t) = a_n t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0 \]

Then, since $\hat{T}(w) \in W$,

\[ \begin{aligned} f(\hat{T})(w) &= (a_n \hat{T}^n + a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\ &= (a_n \hat{T}^{n-1})(\hat{T}(w)) + (a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\ &= (a_n T^{n-1})(T(w)) + (a_{n-1}T^{n-1} + \cdots + a_0 I)(w) \\ &= f(T)(w) \end{aligned} \]

(ii) Let $m(t)$ denote the minimum polynomial of $T$. Then by (i), $m(\hat{T})(w) = m(T)(w) = 0(w) = 0$
for every $w \in W$; that is, $\hat{T}$ is a zero of the polynomial $m(t)$. Hence the minimum polynomial
of $\hat{T}$ divides $m(t)$.


INVARIANT DIRECT-SUM DECOMPOSITIONS 
10.7. Prove Theorem 10.4: Suppose $W_1, \ldots, W_r$ are subspaces of $V$ and suppose, for
$i = 1, \ldots, r$, $\{w_{i1}, \ldots, w_{in_i}\}$ is a basis of $W_i$. Then $V$ is the direct sum of the $W_i$ if
and only if the union

\[ B = \{w_{11}, \ldots, w_{1n_1}, \ldots, w_{r1}, \ldots, w_{rn_r}\} \]

is a basis of $V$.

Suppose $B$ is a basis of $V$. Then, for any $v \in V$,

\[ v = a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = w_1 + w_2 + \cdots + w_r \]

where $w_i = a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We next show that such a sum is unique. Suppose

\[ v = w_1' + w_2' + \cdots + w_r' \quad \text{where } w_i' \in W_i \]

Since $\{w_{i1}, \ldots, w_{in_i}\}$ is a basis of $W_i$, $w_i' = b_{i1}w_{i1} + \cdots + b_{in_i}w_{in_i}$ and so

\[ v = b_{11}w_{11} + \cdots + b_{1n_1}w_{1n_1} + \cdots + b_{r1}w_{r1} + \cdots + b_{rn_r}w_{rn_r} \]

Since $B$ is a basis of $V$, $a_{ij} = b_{ij}$ for each $i$ and each $j$. Hence $w_i = w_i'$ and so the sum for $v$
is unique. Accordingly, $V$ is the direct sum of the $W_i$.

Conversely, suppose $V$ is the direct sum of the $W_i$. Then for any $v \in V$, $v = w_1 + \cdots + w_r$
where $w_i \in W_i$. Since $\{w_{ij}\}$ is a basis of $W_i$, each $w_i$ is a linear combination of the $w_{ij}$, and so $v$
is a linear combination of the elements of $B$. Thus $B$ spans $V$. We now show that $B$ is linearly
independent. Suppose

\[ a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = 0 \]

Note that $a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We also have that $0 = 0 + 0 + \cdots + 0$ where $0 \in W_i$. Since
such a sum for 0 is unique,

\[ a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} = 0 \quad \text{for } i = 1, \ldots, r \]

The independence of the bases $\{w_{ij}\}$ implies that all the $a$'s are 0. Thus $B$ is linearly independent
and hence is a basis of $V$.


10.8. Suppose $T: V \to V$ is linear and suppose $T = T_1 \oplus T_2$ with respect to a $T$-invariant
direct-sum decomposition $V = U \oplus W$. Show that:

(i) $m(t)$ is the least common multiple of $m_1(t)$ and $m_2(t)$ where $m(t)$, $m_1(t)$ and $m_2(t)$
are the minimum polynomials of $T$, $T_1$ and $T_2$ respectively;

(ii) $\Delta(t) = \Delta_1(t)\,\Delta_2(t)$, where $\Delta(t)$, $\Delta_1(t)$ and $\Delta_2(t)$ are the characteristic polynomials of
$T$, $T_1$ and $T_2$ respectively.

(i) By Problem 10.6, each of $m_1(t)$ and $m_2(t)$ divides $m(t)$. Now suppose $f(t)$ is a multiple of both
$m_1(t)$ and $m_2(t)$; then $f(T_1)(U) = 0$ and $f(T_2)(W) = 0$. Let $v \in V$; then $v = u + w$ with
$u \in U$ and $w \in W$. Now

\[ f(T)v = f(T)u + f(T)w = f(T_1)u + f(T_2)w = 0 + 0 = 0 \]

That is, $T$ is a zero of $f(t)$. Hence $m(t)$ divides $f(t)$, and so $m(t)$ is the least common multiple of
$m_1(t)$ and $m_2(t)$.

(ii) By Theorem 10.5, $T$ has a matrix representation $M = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$ where $A$ and $B$ are matrix
representations of $T_1$ and $T_2$ respectively. Then, by Problem 9.66,

\[ \Delta(t) = |tI - M| = \begin{vmatrix} tI - A & 0 \\ 0 & tI - B \end{vmatrix} = |tI - A|\,|tI - B| = \Delta_1(t)\,\Delta_2(t) \]

as required.


10.9. Prove Theorem 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)\,h(t)$ are
polynomials such that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$
is the direct sum of the $T$-invariant subspaces $U$ and $W$ where $U = \operatorname{Ker} g(T)$ and
$W = \operatorname{Ker} h(T)$.

Note first that $U$ and $W$ are $T$-invariant by Theorem 10.2. Now since $g(t)$ and $h(t)$ are relatively
prime, there exist polynomials $r(t)$ and $s(t)$ such that

\[ r(t)\,g(t) + s(t)\,h(t) = 1 \]

Hence for the operator $T$,

\[ r(T)\,g(T) + s(T)\,h(T) = I \qquad (*) \]

Let $v \in V$; then by (*),

\[ v = r(T)\,g(T)\,v + s(T)\,h(T)\,v \]

But the first term in this sum belongs to $W = \operatorname{Ker} h(T)$ since

\[ h(T)\,r(T)\,g(T)\,v = r(T)\,g(T)\,h(T)\,v = r(T)\,f(T)\,v = r(T)\,0\,v = 0 \]

Similarly, the second term belongs to $U$. Hence $V$ is the sum of $U$ and $W$.

To prove that $V = U \oplus W$, we must show that a sum $v = u + w$ with $u \in U$, $w \in W$, is
uniquely determined by $v$. Applying the operator $r(T)\,g(T)$ to $v = u + w$ and using $g(T)u = 0$,
we obtain

\[ r(T)\,g(T)\,v = r(T)\,g(T)\,u + r(T)\,g(T)\,w = r(T)\,g(T)\,w \]

Also, applying (*) to $w$ alone and using $h(T)w = 0$, we obtain

\[ w = r(T)\,g(T)\,w + s(T)\,h(T)\,w = r(T)\,g(T)\,w \]

Both of the above formulas give us $w = r(T)\,g(T)\,v$ and so $w$ is uniquely determined by $v$. Similarly
$u$ is uniquely determined by $v$. Hence $V = U \oplus W$, as required.
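
The construction in the proof of Theorem 10.7 can be followed on a small case. The sketch below assumes the hypothetical matrix $T = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$ with $f(t) = (t-1)(t-2)$, $g(t) = t - 1$ and $h(t) = t - 2$; since $g(t) - h(t) = 1$ we may take $r(t) = 1$ and $s(t) = -1$, so that $w = g(T)v$ and $u = -h(T)v$.

```python
def mat_vec(a, v):
    """Apply the matrix a to the column vector v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

T = [[1, 1],
     [0, 2]]                     # f(T) = (T - I)(T - 2I) = 0

g_T = [[0, 1],
       [0, 1]]                   # g(T) = T - I
h_T = [[-1, 1],
       [0, 0]]                   # h(T) = T - 2I

v = [3, 5]

w = mat_vec(g_T, v)                       # r(T) g(T) v with r = 1; lies in Ker h(T)
u = [-x for x in mat_vec(h_T, v)]         # s(T) h(T) v with s = -1; lies in Ker g(T)

print(u, w)
print(mat_vec(g_T, u), mat_vec(h_T, w))   # both are the zero vector
```

Here `u + w` recovers `v`, with `u` in $U = \operatorname{Ker} g(T)$ and `w` in $W = \operatorname{Ker} h(T)$.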


CHAP. 10] CANONICAL FORMS 233 


10.10. 


10.11. 


10.12. 


Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(é) is the minimal poly- 
nomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the 


poe T; of T to U and h(t) is the minimal polynomial of the restriction T» of 
o W. 


Let m,(t) and m,(t) be the minimal polynomials of T, and T, respectively. Note that Gel) —20 
and h(T,;)=0 because U = Kerg(T) and W = Ker h(T). Thus 


my,(t) divides g(t) and = m,(t) divides h(t) (1) 


By Problem 10.9, f(t) is the least common multiple of m,(t) and m,(t). But m,(t) and m,(t) are 
relatively prime since g(t) and h(t) are relatively prime. Accordingly, f(t) = m,(t) m,(t). We also 
have that f(t) = g(t) h(t). These two equations together with (1) and the fact that all the polynomials 
are monic, imply that g(t) = m,(t) and h(t) = m,(t), as required. 


10.11. Prove the Primary Decomposition Theorem 10.6: Let $T: V \to V$ be a linear operator with minimal polynomial
$$m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$$
where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum of $T$-invariant subspaces $W_1, \ldots, W_r$ where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of $T$ to $W_i$.


The proof is by induction on $r$. The case $r = 1$ is trivial. Suppose that the theorem has been proved for $r - 1$. By Theorem 10.7 we can write $V$ as the direct sum of $T$-invariant subspaces $W_1$ and $V_1$ where $W_1$ is the kernel of $f_1(T)^{n_1}$ and where $V_1$ is the kernel of $f_2(T)^{n_2} \cdots f_r(T)^{n_r}$. By Theorem 10.8, the minimal polynomials of the restrictions of $T$ to $W_1$ and $V_1$ are respectively $f_1(t)^{n_1}$ and $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$.

Denote the restriction of $T$ to $V_1$ by $T_1$. By the inductive hypothesis, $V_1$ is the direct sum of subspaces $W_2, \ldots, W_r$ such that $W_i$ is the kernel of $f_i(T_1)^{n_i}$ and such that $f_i(t)^{n_i}$ is the minimal polynomial for the restriction of $T_1$ to $W_i$. But the kernel of $f_i(T)^{n_i}$, for $i = 2, \ldots, r$, is necessarily contained in $V_1$ since $f_i(t)^{n_i}$ divides $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$. Thus the kernel of $f_i(T)^{n_i}$ is the same as the kernel of $f_i(T_1)^{n_i}$, which is $W_i$. Also, the restriction of $T$ to $W_i$ is the same as the restriction of $T_1$ to $W_i$ (for $i = 2, \ldots, r$); hence $f_i(t)^{n_i}$ is also the minimal polynomial for the restriction of $T$ to $W_i$. Thus $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$ is the desired decomposition of $T$.


10.12. Prove Theorem 10.9: A linear operator $T: V \to V$ has a diagonal matrix representation if and only if its minimal polynomial $m(t)$ is a product of distinct linear polynomials.

Suppose $m(t)$ is a product of distinct linear polynomials; say,
$$m(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_r)$$
where the $\lambda_i$ are distinct scalars. By the primary decomposition theorem, $V$ is the direct sum of subspaces $W_1, \ldots, W_r$ where $W_i = \operatorname{Ker}(T - \lambda_i I)$. Thus if $v \in W_i$, then $(T - \lambda_i I)(v) = 0$ or $T(v) = \lambda_i v$. In other words, every vector in $W_i$ is an eigenvector belonging to the eigenvalue $\lambda_i$. By Theorem 10.4, the union of bases for $W_1, \ldots, W_r$ is a basis of $V$. This basis consists of eigenvectors and so $T$ is diagonalizable.

Conversely, suppose $T$ is diagonalizable, i.e. $V$ has a basis consisting of eigenvectors of $T$. Let $\lambda_1, \ldots, \lambda_s$ be the distinct eigenvalues of $T$. Then the operator
$$f(T) = (T - \lambda_1 I)(T - \lambda_2 I) \cdots (T - \lambda_s I)$$
maps each basis vector into 0. Thus $f(T) = 0$ and hence the minimum polynomial $m(t)$ of $T$ divides the polynomial
$$f(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_s)$$
Accordingly, $m(t)$ is a product of distinct linear polynomials.
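Remark. The criterion of Theorem 10.9 can be checked numerically: a matrix $A$ is diagonalizable exactly when the product of the factors $A - \lambda I$, taken once for each distinct eigenvalue, is the zero matrix. The following is a minimal sketch under stated assumptions, not a definitive implementation: the helper names (`mat_mul`, `shifted`, `is_diagonalizable`) are hypothetical, the entries are exact integers or rationals, and the caller is assumed to supply every eigenvalue of $A$.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(a, lam):
    """Return the matrix A - lam*I."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


def is_diagonalizable(a, eigenvalues):
    """Check whether the product of (A - lam*I) over the distinct
    eigenvalues vanishes.

    Assumption: `eigenvalues` lists every eigenvalue of A (repeats allowed).
    """
    n = len(a)
    product = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for lam in set(eigenvalues):
        product = mat_mul(product, shifted(a, lam))
    return all(entry == 0 for row in product for entry in row)


jordan_like = [[2, 1], [0, 2]]   # minimal polynomial (t - 2)^2
diagonal = [[2, 0], [0, 3]]      # minimal polynomial (t - 2)(t - 3)
print(is_diagonalizable(jordan_like, [2, 2]))
print(is_diagonalizable(diagonal, [2, 3]))
```

The first matrix fails the test because $A - 2I \neq 0$, so its minimal polynomial has a repeated factor; the second passes.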




NILPOTENT OPERATORS, JORDAN CANONICAL FORM 


10.13. Let $T: V \to V$ be linear. Suppose, for $v \in V$, $T^k(v) = 0$ but $T^{k-1}(v) \neq 0$. Prove:

(i) The set $S = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent.

(ii) The subspace $W$ generated by $S$ is $T$-invariant.

(iii) The restriction $\hat{T}$ of $T$ to $W$ is nilpotent of index $k$.

(iv) Relative to the basis $\{T^{k-1}(v), \ldots, T(v), v\}$ of $W$, the matrix of $\hat{T}$ is of the form
$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$
Hence the above $k$-square matrix is nilpotent of index $k$.


(i) Suppose
$$a v + a_1 T(v) + a_2 T^2(v) + \cdots + a_{k-1} T^{k-1}(v) = 0 \qquad (*)$$
Applying $T^{k-1}$ to $(*)$ and using $T^k(v) = 0$, we obtain $a T^{k-1}(v) = 0$; since $T^{k-1}(v) \neq 0$, $a = 0$. Now applying $T^{k-2}$ to $(*)$ and using $T^k(v) = 0$ and $a = 0$, we find $a_1 T^{k-1}(v) = 0$; hence $a_1 = 0$. Next applying $T^{k-3}$ to $(*)$ and using $T^k(v) = 0$ and $a = a_1 = 0$, we obtain $a_2 T^{k-1}(v) = 0$; hence $a_2 = 0$. Continuing this process, we find that all the $a$'s are 0; hence $S$ is independent.

(ii) Let $w \in W$. Then
$$w = b v + b_1 T(v) + b_2 T^2(v) + \cdots + b_{k-1} T^{k-1}(v)$$
Using $T^k(v) = 0$, we have that
$$T(w) = b T(v) + b_1 T^2(v) + \cdots + b_{k-2} T^{k-1}(v) \in W$$
Thus $W$ is $T$-invariant.

(iii) By hypothesis $T^k(v) = 0$. Hence, for $i = 0, \ldots, k-1$,
$$\hat{T}^k(T^i(v)) = T^{k+i}(v) = 0$$
That is, applying $\hat{T}^k$ to each generator of $W$, we obtain 0; hence $\hat{T}^k = 0$ and so $\hat{T}$ is nilpotent of index at most $k$. On the other hand, $\hat{T}^{k-1}(v) = T^{k-1}(v) \neq 0$; hence $\hat{T}$ is nilpotent of index exactly $k$.

(iv) For the basis $\{T^{k-1}(v), T^{k-2}(v), \ldots, T(v), v\}$ of $W$,
$$\begin{aligned} \hat{T}(T^{k-1}(v)) &= T^k(v) = 0 \\ \hat{T}(T^{k-2}(v)) &= T^{k-1}(v) \\ \hat{T}(T^{k-3}(v)) &= T^{k-2}(v) \\ &\;\;\vdots \\ \hat{T}(T(v)) &= T^2(v) \\ \hat{T}(v) &= T(v) \end{aligned}$$
Hence the matrix of $\hat{T}$ in this basis is
$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$

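10.14. Let $T: V \to V$ be linear. Let $U = \operatorname{Ker} T^i$ and $W = \operatorname{Ker} T^{i+1}$. Show that (i) $U \subset W$, (ii) $T(W) \subset U$.

(i) Suppose $u \in U = \operatorname{Ker} T^i$. Then $T^i(u) = 0$ and so $T^{i+1}(u) = T(T^i(u)) = T(0) = 0$. Thus $u \in \operatorname{Ker} T^{i+1} = W$. But this is true for every $u \in U$; hence $U \subset W$.

(ii) Similarly, if $w \in W = \operatorname{Ker} T^{i+1}$, then $T^{i+1}(w) = 0$. Thus $T^i(T(w)) = T^{i+1}(w) = 0$, i.e. $T(w) \in \operatorname{Ker} T^i = U$, and so $T(W) \subset U$.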


10.15. Let $T: V \to V$ be linear. Let $X = \operatorname{Ker} T^{i-2}$, $Y = \operatorname{Ker} T^{i-1}$ and $Z = \operatorname{Ker} T^i$. By the preceding problem, $X \subset Y \subset Z$. Suppose
$$\{u_1, \ldots, u_r\}, \quad \{u_1, \ldots, u_r, v_1, \ldots, v_s\}, \quad \{u_1, \ldots, u_r, v_1, \ldots, v_s, w_1, \ldots, w_t\}$$
are bases of $X$, $Y$ and $Z$ respectively. Show that
$$S = \{u_1, \ldots, u_r, T(w_1), \ldots, T(w_t)\}$$
is contained in $Y$ and is linearly independent.

By the preceding problem, $T(Z) \subset Y$ and hence $S \subset Y$. Now suppose $S$ is linearly dependent. Then there exists a relation
$$a_1 u_1 + \cdots + a_r u_r + b_1 T(w_1) + \cdots + b_t T(w_t) = 0$$
where at least one coefficient is not zero. Furthermore, since $\{u_i\}$ is independent, at least one of the $b_k$ must be nonzero. Transposing, we find
$$b_1 T(w_1) + \cdots + b_t T(w_t) = -a_1 u_1 - \cdots - a_r u_r \in X = \operatorname{Ker} T^{i-2}$$
Hence
$$T^{i-2}(b_1 T(w_1) + \cdots + b_t T(w_t)) = 0$$
Thus $T^{i-1}(b_1 w_1 + \cdots + b_t w_t) = 0$ and so $b_1 w_1 + \cdots + b_t w_t \in Y = \operatorname{Ker} T^{i-1}$.

Since $\{u_i, v_j\}$ generates $Y$, we obtain a relation among the $u_i$, $v_j$ and $w_k$ where one of the coefficients, i.e. one of the $b_k$, is not zero. This contradicts the fact that $\{u_i, v_j, w_k\}$ is independent. Hence $S$ must also be independent.


10.16. Prove Theorem 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a block diagonal matrix representation whose diagonal entries are of the form
$$N = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$
There is at least one $N$ of order $k$ and all other $N$ are of orders $\leq k$. The number of $N$ of each possible order is uniquely determined by $T$. Moreover, the total number of $N$ of all orders is the nullity of $T$.

Suppose $\dim V = n$. Let $W_1 = \operatorname{Ker} T$, $W_2 = \operatorname{Ker} T^2$, ..., $W_k = \operatorname{Ker} T^k$. Set $m_i = \dim W_i$, for $i = 1, \ldots, k$. Since $T$ is of index $k$, $W_k = V$ and $W_{k-1} \neq V$ and so $m_{k-1} < m_k = n$. By Problem 10.14,
$$W_1 \subset W_2 \subset \cdots \subset W_k = V$$
Thus, by induction, we can choose a basis $\{u_1, \ldots, u_n\}$ of $V$ such that $\{u_1, \ldots, u_{m_i}\}$ is a basis of $W_i$.

We now choose a new basis for $V$ with respect to which $T$ has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting
$$v(1, k) = u_{m_{k-1}+1}, \quad v(2, k) = u_{m_{k-1}+2}, \quad \ldots, \quad v(m_k - m_{k-1}, k) = u_{m_k}$$




and setting
$$v(1, k-1) = Tv(1, k), \quad v(2, k-1) = Tv(2, k), \quad \ldots, \quad v(m_k - m_{k-1}, k-1) = Tv(m_k - m_{k-1}, k)$$
By the preceding problem,
$$S_1 = \{u_1, \ldots, u_{m_{k-2}}, v(1, k-1), \ldots, v(m_k - m_{k-1}, k-1)\}$$
is a linearly independent subset of $W_{k-1}$. We extend $S_1$ to a basis of $W_{k-1}$ by adjoining new elements (if necessary) which we denote by
$$v(m_k - m_{k-1} + 1, k-1), \quad v(m_k - m_{k-1} + 2, k-1), \quad \ldots, \quad v(m_{k-1} - m_{k-2}, k-1)$$
Next we set
$$v(1, k-2) = Tv(1, k-1), \quad v(2, k-2) = Tv(2, k-1), \quad \ldots, \quad v(m_{k-1} - m_{k-2}, k-2) = Tv(m_{k-1} - m_{k-2}, k-1)$$
Again by the preceding problem,
$$S_2 = \{u_1, \ldots, u_{m_{k-3}}, v(1, k-2), \ldots, v(m_{k-1} - m_{k-2}, k-2)\}$$
is a linearly independent subset of $W_{k-2}$ which we can extend to a basis of $W_{k-2}$ by adjoining elements
$$v(m_{k-1} - m_{k-2} + 1, k-2), \quad v(m_{k-1} - m_{k-2} + 2, k-2), \quad \ldots, \quad v(m_{k-2} - m_{k-3}, k-2)$$
Continuing in this manner we get a new basis for $V$ which for convenient reference we arrange as follows:
$$\begin{array}{l} v(1, k), \ \ldots, \ v(m_k - m_{k-1}, k) \\ v(1, k-1), \ \ldots, \ v(m_k - m_{k-1}, k-1), \ \ldots, \ v(m_{k-1} - m_{k-2}, k-1) \\ \qquad\vdots \\ v(1, 2), \ \ldots, \ v(m_k - m_{k-1}, 2), \ \ldots, \ v(m_{k-1} - m_{k-2}, 2), \ \ldots, \ v(m_2 - m_1, 2) \\ v(1, 1), \ \ldots, \ v(m_k - m_{k-1}, 1), \ \ldots, \ v(m_{k-1} - m_{k-2}, 1), \ \ldots, \ v(m_2 - m_1, 1), \ \ldots, \ v(m_1, 1) \end{array}$$


The bottom row forms a basis of $W_1$, the bottom two rows form a basis of $W_2$, etc. But what is important for us is that $T$ maps each vector into the vector immediately below it in the table or into 0 if the vector is in the bottom row. That is,
$$T\,v(i, j) = \begin{cases} v(i, j-1) & \text{for } j > 1 \\ 0 & \text{for } j = 1 \end{cases}$$
Now it is clear (see Problem 10.13(iv)) that $T$ will have the desired form if the $v(i, j)$ are ordered lexicographically: beginning with $v(1, 1)$ and moving up the first column to $v(1, k)$, then jumping to $v(2, 1)$ and moving up the second column as far as possible, etc.

Moreover, there will be exactly
$$\begin{array}{ll} m_k - m_{k-1} & \text{diagonal entries of order } k \\ (m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2} & \text{diagonal entries of order } k-1 \\ \qquad\vdots & \\ 2m_2 - m_1 - m_3 & \text{diagonal entries of order } 2 \\ 2m_1 - m_2 & \text{diagonal entries of order } 1 \end{array}$$
as can be read off directly from the table. In particular, since the numbers $m_1, \ldots, m_k$ are uniquely determined by $T$, the number of diagonal entries of each order is uniquely determined by $T$. Finally, the identity
$$m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + \cdots + (2m_2 - m_1 - m_3) + (2m_1 - m_2)$$
shows that the nullity $m_1$ of $T$ is the total number of diagonal entries of $T$.
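Remark. The counting formulas above give a direct way to compute the block structure of a nilpotent matrix from the nullities $m_j = \dim \operatorname{Ker} A^j$: with the conventions $m_0 = 0$ and $m_{k+1} = m_k = n$, the number of blocks of order $j$ is $2m_j - m_{j-1} - m_{j+1}$. The following is a minimal sketch, assuming exact rational entries; the function names (`rank`, `nilpotent_block_counts`) are hypothetical and not taken from the text.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def nilpotent_block_counts(a):
    """Return {order: number of blocks} for a nilpotent matrix `a`."""
    n = len(a)
    nullities = [0]                          # m_0 = 0
    power = a
    while True:
        nullities.append(n - rank(power))    # m_j = dim Ker A^j
        if nullities[-1] == n:
            break
        if len(nullities) > n:
            raise ValueError("matrix is not nilpotent")
        power = mat_mul(power, a)
    k = len(nullities) - 1                   # index of nilpotency
    nullities.append(n)                      # m_{k+1} = m_k = n
    counts = {}
    for j in range(1, k + 1):
        c = 2 * nullities[j] - nullities[j - 1] - nullities[j + 1]
        if c:
            counts[j] = c
    return counts


# Two blocks of order 2 and one block of order 1.
M = [[0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0]]
print(nilpotent_block_counts(M))
```

For the matrix `M` shown, the nullities are $m_1 = 3$ and $m_2 = 5$, so the formulas give $2m_1 - m_2 = 1$ block of order 1 and $m_2 - m_1 = 2$ blocks of order 2.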


Opole te Oot On Olek 
OnrO es wheel ot 000 0 0 
10.17. Let A=, 0 0 0 0 0 Then) -A2p= 1. 0000 Oe Or20 and A?=0; 
0-0 010-0 0-0. 0 0:26 
00 0:0 0 6° 0 020 40 


hence A is nilpotent of index 2. Find the nilpotent matrix M in canonical form 
which is similar to A. 




Since $A$ is nilpotent of index 2, $M$ contains a diagonal block of order 2 and none greater than 2. Note that rank $A = 2$; hence nullity of $A = 5 - 2 = 3$. Thus $M$ contains 3 diagonal blocks. Accordingly $M$ must contain 2 diagonal blocks of order 2 and 1 of order 1; that is,
$$M = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$


10.18. Prove Theorem 10.11, page 226, on the Jordan canonical form for an operator T. 


By the primary decomposition theorem, $T$ is decomposable into operators $T_1, \ldots, T_r$, i.e. $T = T_1 \oplus \cdots \oplus T_r$, where $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$. Thus in particular,
$$(T_1 - \lambda_1 I)^{m_1} = 0, \quad \ldots, \quad (T_r - \lambda_r I)^{m_r} = 0$$
Set $N_i = T_i - \lambda_i I$. Then for $i = 1, \ldots, r$,
$$T_i = N_i + \lambda_i I, \quad \text{where } N_i^{m_i} = 0$$
That is, $T_i$ is the sum of the scalar operator $\lambda_i I$ and a nilpotent operator $N_i$, which is of index $m_i$ since $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$.

Now by Theorem 10.10 on nilpotent operators, we can choose a basis so that $N_i$ is in canonical form. In this basis, $T_i = N_i + \lambda_i I$ is represented by a block diagonal matrix $M_i$ whose diagonal entries are the matrices $J_{ij}$. The direct sum $J$ of the matrices $M_i$ is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of $T$.

Lastly we must show that the blocks $J_{ij}$ satisfy the required properties. Property (i) follows from the fact that $N_i$ is of index $m_i$. Property (ii) is true since $T$ and $J$ have the same characteristic polynomial. Property (iii) is true since the nullity of $N_i = T_i - \lambda_i I$ is equal to the geometric multiplicity of the eigenvalue $\lambda_i$. Property (iv) follows from the fact that the $T_i$ and hence the $N_i$ are uniquely determined by $T$.


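10.19. Determine all possible Jordan canonical forms for a linear operator $T: V \to V$ whose characteristic polynomial is $\Delta(t) = (t - 2)^3 (t - 5)^2$.

Since $t - 2$ has exponent 3 in $\Delta(t)$, 2 must appear three times on the main diagonal. Similarly 5 must appear twice. Thus the possible Jordan canonical forms are the following direct sums of Jordan blocks, where $J_k(\lambda)$ denotes the $k$-square block with $\lambda$ on the diagonal and 1's on the superdiagonal:

(i) $J_3(2) \oplus J_2(5)$
(ii) $J_3(2) \oplus J_1(5) \oplus J_1(5)$
(iii) $J_2(2) \oplus J_1(2) \oplus J_2(5)$
(iv) $J_2(2) \oplus J_1(2) \oplus J_1(5) \oplus J_1(5)$
(v) $J_1(2) \oplus J_1(2) \oplus J_1(2) \oplus J_2(5)$
(vi) $J_1(2) \oplus J_1(2) \oplus J_1(2) \oplus J_1(5) \oplus J_1(5)$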

10.20. Determine all possible Jordan canonical forms $J$ for a matrix of order 5 whose minimal polynomial is $m(t) = (t - 2)^2$.

$J$ must have one Jordan block of order 2 and the others must be of order 2 or 1. Thus there are only two possibilities:
$$J = \begin{pmatrix} 2 & 1 & & & \\ & 2 & & & \\ & & 2 & 1 & \\ & & & 2 & \\ & & & & 2 \end{pmatrix} \quad\text{or}\quad J = \begin{pmatrix} 2 & 1 & & & \\ & 2 & & & \\ & & 2 & & \\ & & & 2 & \\ & & & & 2 \end{pmatrix}$$
Note that all the diagonal entries must be 2 since 2 is the only eigenvalue.
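Remark. The enumeration in the last two problems amounts to listing partitions: for each eigenvalue, the block orders form a partition of its algebraic multiplicity, and when the minimal polynomial is known the largest part must equal the exponent of the corresponding factor. The following is a minimal sketch of that bookkeeping; the function names and the dictionary encoding of the polynomials are assumptions made for illustration.

```python
from itertools import product


def partitions(total, largest):
    """Yield the partitions of `total` into parts of size <= `largest`,
    each as a non-increasing tuple."""
    if total == 0:
        yield ()
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, first):
            yield (first,) + rest


def jordan_forms(char_exponents, min_exponents=None):
    """List the block-order patterns allowed for each eigenvalue.

    char_exponents: {eigenvalue: exponent in the characteristic polynomial}
    min_exponents:  {eigenvalue: exponent in the minimal polynomial};
                    when omitted, the largest block order is unconstrained.
    """
    eigenvalues = sorted(char_exponents)
    options = []
    for lam in eigenvalues:
        mult = char_exponents[lam]
        if min_exponents is None:
            choices = list(partitions(mult, mult))
        else:
            top = min_exponents[lam]
            # The largest block must have order exactly `top`.
            choices = [p for p in partitions(mult, top) if p[0] == top]
        options.append(choices)
    return [dict(zip(eigenvalues, combo)) for combo in product(*options)]


# Problem 10.19: characteristic polynomial (t - 2)^3 (t - 5)^2.
for form in jordan_forms({2: 3, 5: 2}):
    print(form)

# Problem 10.20: order 5, minimal polynomial (t - 2)^2.
for form in jordan_forms({2: 5}, {2: 2}):
    print(form)
```

The first call lists six patterns and the second lists two, in agreement with the solutions above.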


QUOTIENT SPACE AND TRIANGULAR FORM


10.21. Let $W$ be a subspace of a vector space $V$. Show that the following are equivalent:
(i) $u \in v + W$, (ii) $u - v \in W$, (iii) $v \in u + W$.

Suppose $u \in v + W$. Then there exists $w_0 \in W$ such that $u = v + w_0$. Hence $u - v = w_0 \in W$. Conversely, suppose $u - v \in W$. Then $u - v = w_0$ where $w_0 \in W$. Hence $u = v + w_0 \in v + W$. Thus (i) and (ii) are equivalent.

We also have: $u - v \in W$ iff $-(u - v) = v - u \in W$ iff $v \in u + W$. Thus (ii) and (iii) are also equivalent.


10.22. Prove: The cosets of $W$ in $V$ partition $V$ into mutually disjoint sets. That is:
(i) any two cosets $u + W$ and $v + W$ are either identical or disjoint; and
(ii) each $v \in V$ belongs to a coset; in fact, $v \in v + W$.

Furthermore, $u + W = v + W$ if and only if $u - v \in W$, and so $(v + w) + W = v + W$ for any $w \in W$.

Let $v \in V$. Since $0 \in W$, we have $v = v + 0 \in v + W$ which proves (ii).

Now suppose the cosets $u + W$ and $v + W$ are not disjoint; say, the vector $x$ belongs to both $u + W$ and $v + W$. Then $u - x \in W$ and $x - v \in W$. The proof of (i) is complete if we show that $u + W = v + W$. Let $u + w_0$ be any element in the coset $u + W$. Since $u - x$, $x - v$ and $w_0$ belong to $W$,
$$(u + w_0) - v = (u - x) + (x - v) + w_0 \in W$$
Thus $u + w_0 \in v + W$ and hence the coset $u + W$ is contained in the coset $v + W$. Similarly $v + W$ is contained in $u + W$ and so $u + W = v + W$.

The last statement follows from the fact that $u + W = v + W$ if and only if $u \in v + W$, and by the preceding problem this is equivalent to $u - v \in W$.


10.23. Let $W$ be the solution space of the homogeneous equation $2x + 3y + 4z = 0$. Describe the cosets of $W$ in $\mathbf{R}^3$.

$W$ is a plane through the origin $O = (0, 0, 0)$, and the cosets of $W$ are the planes parallel to $W$. Equivalently, the cosets of $W$ are the solution sets of the family of equations
$$2x + 3y + 4z = k, \quad k \in \mathbf{R}$$
In particular the coset $v + W$, where $v = (a, b, c)$, is the solution set of the linear equation
$$2x + 3y + 4z = 2a + 3b + 4c \quad\text{or}\quad 2(x - a) + 3(y - b) + 4(z - c) = 0$$
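Remark. Since each coset of $W$ is the plane $2x + 3y + 4z = k$ for exactly one $k$, the value of the left-hand side at a vector labels its coset, and two vectors lie in the same coset exactly when their labels agree (equivalently, when their difference lies in $W$). A minimal sketch of this test follows; the names `NORMAL`, `coset_label` and `same_coset` are hypothetical.

```python
NORMAL = (2, 3, 4)  # coefficients of the equation 2x + 3y + 4z = 0


def coset_label(v):
    """Return k such that v lies on the plane 2x + 3y + 4z = k."""
    return sum(a * x for a, x in zip(NORMAL, v))


def same_coset(u, v):
    """u + W == v + W exactly when u - v lies in W, i.e. the labels agree."""
    return coset_label(u) == coset_label(v)


print(same_coset((1, 0, 0), (0, 2, -1)))   # both lie on 2x + 3y + 4z = 2
print(same_coset((1, 0, 0), (0, 1, 0)))    # labels 2 and 3 differ
```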


10.24. Suppose $W$ is a subspace of a vector space $V$. Show that the operations in Theorem 10.15, page 229, are well defined; namely, show that if $u + W = u' + W$ and $v + W = v' + W$, then
(i) $(u + v) + W = (u' + v') + W$ and (ii) $ku + W = ku' + W$, for any $k \in K$.

(i) Since $u + W = u' + W$ and $v + W = v' + W$, both $u - u'$ and $v - v'$ belong to $W$. But then $(u + v) - (u' + v') = (u - u') + (v - v') \in W$. Hence $(u + v) + W = (u' + v') + W$.

(ii) Also, since $u - u' \in W$ implies $k(u - u') \in W$, then $ku - ku' = k(u - u') \in W$; hence $ku + W = ku' + W$.


10.25. Let $V$ be a vector space and $W$ a subspace of $V$. Show that the natural map $\eta: V \to V/W$, defined by $\eta(v) = v + W$, is linear.

For any $u, v \in V$ and any $k \in K$, we have
$$\eta(u + v) = u + v + W = u + W + v + W = \eta(u) + \eta(v)$$
and
$$\eta(kv) = kv + W = k(v + W) = k\,\eta(v)$$
Accordingly, $\eta$ is linear.


10.26. Let $W$ be a subspace of a vector space $V$. Suppose $\{w_1, \ldots, w_r\}$ is a basis of $W$ and the set of cosets $\{\bar{v}_1, \ldots, \bar{v}_s\}$, where $\bar{v}_j = v_j + W$, is a basis of the quotient space. Show that $B = \{v_1, \ldots, v_s, w_1, \ldots, w_r\}$ is a basis of $V$. Thus $\dim V = \dim W + \dim (V/W)$.

Suppose $u \in V$. Since $\{\bar{v}_j\}$ is a basis of $V/W$,
$$\bar{u} = u + W = a_1 \bar{v}_1 + a_2 \bar{v}_2 + \cdots + a_s \bar{v}_s$$
Hence $u = a_1 v_1 + \cdots + a_s v_s + w$ where $w \in W$. Since $\{w_i\}$ is a basis of $W$,
$$u = a_1 v_1 + \cdots + a_s v_s + b_1 w_1 + \cdots + b_r w_r$$
Accordingly, $B$ generates $V$.

We now show that $B$ is linearly independent. Suppose
$$c_1 v_1 + \cdots + c_s v_s + d_1 w_1 + \cdots + d_r w_r = 0 \qquad (1)$$
Then
$$c_1 \bar{v}_1 + \cdots + c_s \bar{v}_s = \bar{0} = W$$
Since $\{\bar{v}_j\}$ is independent, the $c$'s are all 0. Substituting into (1), we find $d_1 w_1 + \cdots + d_r w_r = 0$. Since $\{w_i\}$ is independent, the $d$'s are all 0. Thus $B$ is linearly independent and therefore a basis of $V$.


10.27. Prove Theorem 10.16: Suppose $W$ is a subspace invariant under a linear operator $T: V \to V$. Then $T$ induces a linear operator $\bar{T}$ on $V/W$ defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if $T$ is a zero of any polynomial, then so is $\bar{T}$. Thus the minimum polynomial of $\bar{T}$ divides the minimum polynomial of $T$.

We first show that $\bar{T}$ is well defined, i.e. if $u + W = v + W$ then $\bar{T}(u + W) = \bar{T}(v + W)$. If $u + W = v + W$ then $u - v \in W$ and, since $W$ is $T$-invariant, $T(u - v) = T(u) - T(v) \in W$. Accordingly,
$$\bar{T}(u + W) = T(u) + W = T(v) + W = \bar{T}(v + W)$$
as required.

We next show that $\bar{T}$ is linear. We have
$$\bar{T}((u + W) + (v + W)) = \bar{T}(u + v + W) = T(u + v) + W = T(u) + T(v) + W = T(u) + W + T(v) + W = \bar{T}(u + W) + \bar{T}(v + W)$$
and
$$\bar{T}(k(u + W)) = \bar{T}(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = k\,\bar{T}(u + W)$$
Thus $\bar{T}$ is linear.




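Now, for any coset $u + W$ in $V/W$,
$$\overline{T^2}(u + W) = T^2(u) + W = T(T(u)) + W = \bar{T}(T(u) + W) = \bar{T}(\bar{T}(u + W)) = \bar{T}^2(u + W)$$
Hence $\overline{T^2} = \bar{T}^2$. Similarly $\overline{T^n} = \bar{T}^n$ for any $n$. Thus for any polynomial
$$f(t) = a_n t^n + \cdots + a_0 = \sum a_i t^i,$$
$$\begin{aligned} \overline{f(T)}(u + W) &= f(T)(u) + W = \sum a_i T^i(u) + W = \sum a_i (T^i(u) + W) \\ &= \sum a_i \overline{T^i}(u + W) = \sum a_i \bar{T}^i(u + W) = \Big(\sum a_i \bar{T}^i\Big)(u + W) = f(\bar{T})(u + W) \end{aligned}$$
and so $\overline{f(T)} = f(\bar{T})$. Accordingly, if $T$ is a root of $f(t)$ then $\overline{f(T)} = \bar{0} = W = f(\bar{T})$, i.e. $\bar{T}$ is also a root of $f(t)$. Thus the theorem is proved.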


10.28. Prove Theorem 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear polynomials. Then $V$ has a basis in which $T$ is represented by a triangular matrix.

The proof is by induction on the dimension of $V$. If $\dim V = 1$, then every matrix representation of $T$ is a 1 by 1 matrix which is triangular.

Now suppose $\dim V = n > 1$ and that the theorem holds for spaces of dimension less than $n$. Since the characteristic polynomial of $T$ factors into linear polynomials, $T$ has at least one eigenvalue and so at least one nonzero eigenvector $v$, say $T(v) = a_{11} v$. Let $W$ be the 1-dimensional subspace spanned by $v$. Set $\bar{V} = V/W$. Then (Problem 10.26) $\dim \bar{V} = \dim V - \dim W = n - 1$. Note also that $W$ is invariant under $T$. By Theorem 10.16, $T$ induces a linear operator $\bar{T}$ on $\bar{V}$ whose minimum polynomial divides the minimum polynomial of $T$. Since the characteristic polynomial of $T$ is a product of linear polynomials, so is its minimum polynomial; hence so are the minimum and characteristic polynomials of $\bar{T}$. Thus $\bar{V}$ and $\bar{T}$ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis $\{\bar{v}_2, \ldots, \bar{v}_n\}$ of $\bar{V}$ such that
$$\begin{aligned} \bar{T}(\bar{v}_2) &= a_{22} \bar{v}_2 \\ \bar{T}(\bar{v}_3) &= a_{32} \bar{v}_2 + a_{33} \bar{v}_3 \\ &\;\;\vdots \\ \bar{T}(\bar{v}_n) &= a_{n2} \bar{v}_2 + a_{n3} \bar{v}_3 + \cdots + a_{nn} \bar{v}_n \end{aligned}$$
Now let $v_2, \ldots, v_n$ be elements of $V$ which belong to the cosets $\bar{v}_2, \ldots, \bar{v}_n$ respectively. Then $\{v, v_2, \ldots, v_n\}$ is a basis of $V$ (Problem 10.26). Since $\bar{T}(\bar{v}_2) = a_{22} \bar{v}_2$, we have
$$\bar{T}(\bar{v}_2) - a_{22} \bar{v}_2 = \bar{0} \quad\text{and so}\quad T(v_2) - a_{22} v_2 \in W$$
But $W$ is spanned by $v$; hence $T(v_2) - a_{22} v_2$ is a multiple of $v$, say
$$T(v_2) - a_{22} v_2 = a_{21} v \quad\text{and so}\quad T(v_2) = a_{21} v + a_{22} v_2$$
Similarly, for $i = 3, \ldots, n$,
$$T(v_i) - a_{i2} v_2 - a_{i3} v_3 - \cdots - a_{ii} v_i \in W \quad\text{and so}\quad T(v_i) = a_{i1} v + a_{i2} v_2 + \cdots + a_{ii} v_i$$
Thus
$$\begin{aligned} T(v) &= a_{11} v \\ T(v_2) &= a_{21} v + a_{22} v_2 \\ &\;\;\vdots \\ T(v_n) &= a_{n1} v + a_{n2} v_2 + \cdots + a_{nn} v_n \end{aligned}$$
and hence the matrix of $T$ in this basis is triangular.


CYCLIC SUBSPACES, RATIONAL CANONICAL FORM 

10.29. Prove Theorem 10.12: Let $Z(v, T)$ be a $T$-cyclic subspace, $T_v$ the restriction of $T$ to $Z(v, T)$, and $m_v(t) = t^k + a_{k-1} t^{k-1} + \cdots + a_0$ the $T$-annihilator of $v$. Then:

(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence $\dim Z(v, T) = k$.

(ii) The minimal polynomial of $T_v$ is $m_v(t)$.

(iii) The matrix of $T_v$ in the above basis is
$$C = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\ 0 & 0 & 0 & \cdots & 1 & -a_{k-1} \end{pmatrix}$$

(i) By definition of $m_v(t)$, $T^k(v)$ is the first vector in the sequence $v, T(v), T^2(v), \ldots$ which is a linear combination of those vectors which precede it in the sequence; hence the set $B = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent. We now only have to show that $Z(v, T) = L(B)$, the linear span of $B$. By the above, $T^k(v) \in L(B)$. We prove by induction that $T^n(v) \in L(B)$ for every $n$. Suppose $n > k$ and $T^{n-1}(v) \in L(B)$, i.e. $T^{n-1}(v)$ is a linear combination of $v, \ldots, T^{k-1}(v)$. Then $T^n(v) = T(T^{n-1}(v))$ is a linear combination of $T(v), \ldots, T^k(v)$. But $T^k(v) \in L(B)$; hence $T^n(v) \in L(B)$ for every $n$. Consequently $f(T)(v) \in L(B)$ for any polynomial $f(t)$. Thus $Z(v, T) = L(B)$ and so $B$ is a basis as claimed.

(ii) Suppose $m(t) = t^s + b_{s-1} t^{s-1} + \cdots + b_0$ is the minimal polynomial of $T_v$. Then, since $v \in Z(v, T)$,
$$0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s-1} T^{s-1}(v) + \cdots + b_0 v$$
Thus $T^s(v)$ is a linear combination of $v, T(v), \ldots, T^{s-1}(v)$, and therefore $k \leq s$. However, $m_v(T)$ annihilates $v$ and hence every $T^i(v)$, so $m_v(T_v) = 0$. Then $m(t)$ divides $m_v(t)$ and so $s \leq k$. Accordingly $k = s$ and hence $m_v(t) = m(t)$.

(iii)
$$\begin{aligned} T_v(v) &= T(v) \\ T_v(T(v)) &= T^2(v) \\ &\;\;\vdots \\ T_v(T^{k-2}(v)) &= T^{k-1}(v) \\ T_v(T^{k-1}(v)) &= T^k(v) = -a_0 v - a_1 T(v) - a_2 T^2(v) - \cdots - a_{k-1} T^{k-1}(v) \end{aligned}$$
By definition, the matrix of $T_v$ in this basis is the transpose of the matrix of coefficients of the above system of equations; hence it is $C$, as required.
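Remark. Parts (ii) and (iii) can be checked on a concrete polynomial: build the matrix $C$ from the coefficients $a_0, \ldots, a_{k-1}$ and confirm that the monic polynomial vanishes at $C$. The following is a minimal sketch with integer coefficients; the function names (`companion`, `evaluate_monic`) and the coefficient ordering (constant term first) are assumptions made for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_0,
    where coeffs = [a_0, a_1, ..., a_{k-1}]."""
    k = len(coeffs)
    c = [[0] * k for _ in range(k)]
    for i in range(1, k):
        c[i][i - 1] = 1            # 1's on the subdiagonal
    for i in range(k):
        c[i][k - 1] = -coeffs[i]   # last column holds -a_0, ..., -a_{k-1}
    return c


def evaluate_monic(coeffs, m):
    """Evaluate m^k + a_{k-1} m^{k-1} + ... + a_0 I by Horner's rule."""
    n = len(m)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, m)
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


# (t + 3)^3 = t^3 + 9t^2 + 27t + 27
coeffs = [27, 27, 9]
C = companion(coeffs)
print(C)
print(evaluate_monic(coeffs, C))
```

For $(t + 3)^3$ the sketch produces the matrix with last column $(-27, -27, -9)$, and evaluating the polynomial at that matrix gives the zero matrix.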


10.30. Let $T: V \to V$ be linear. Let $W$ be a $T$-invariant subspace of $V$ and $\bar{T}$ the induced operator on $V/W$. Prove: (i) The $T$-annihilator of $v \in V$ divides the minimal polynomial of $T$. (ii) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $T$.

(i) The $T$-annihilator of $v \in V$ is the minimal polynomial of the restriction of $T$ to $Z(v, T)$ and therefore, by Problem 10.6, it divides the minimal polynomial of $T$.

(ii) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $\bar{T}$, which divides the minimal polynomial of $T$ by Theorem 10.16.

Remark. In case the minimal polynomial of $T$ is $f(t)^n$ where $f(t)$ is a monic irreducible polynomial, then the $T$-annihilator of $v \in V$ and the $\bar{T}$-annihilator of $\bar{v} \in V/W$ are of the form $f(t)^m$ where $m \leq n$.


10.31. Prove Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$ where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum of $T$-cyclic subspaces $Z_i = Z(v_i, T)$, $i = 1, \ldots, r$, with corresponding $T$-annihilators
$$f(t)^{n_1}, \ f(t)^{n_2}, \ \ldots, \ f(t)^{n_r}, \qquad n = n_1 \geq n_2 \geq \cdots \geq n_r$$
Any other decomposition of $V$ into the direct sum of $T$-cyclic subspaces has the same number of components and the same set of $T$-annihilators.

The proof is by induction on the dimension of $V$. If $\dim V = 1$, then $V$ is itself $T$-cyclic and the lemma holds. Now suppose $\dim V > 1$ and that the lemma holds for those vector spaces of dimension less than that of $V$.




Since the minimal polynomial of $T$ is $f(t)^n$, there exists $v_1 \in V$ such that $f(T)^{n-1}(v_1) \neq 0$; hence the $T$-annihilator of $v_1$ is $f(t)^n$. Let $Z_1 = Z(v_1, T)$ and recall that $Z_1$ is $T$-invariant. Let $\bar{V} = V/Z_1$ and let $\bar{T}$ be the linear operator on $\bar{V}$ induced by $T$. By Theorem 10.16, the minimal polynomial of $\bar{T}$ divides $f(t)^n$; hence the hypothesis holds for $\bar{V}$ and $\bar{T}$. Consequently, by induction, $\bar{V}$ is the direct sum of $\bar{T}$-cyclic subspaces; say,
$$\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$$
where the corresponding $\bar{T}$-annihilators are $f(t)^{n_2}, \ldots, f(t)^{n_r}$, $n \geq n_2 \geq \cdots \geq n_r$.

We claim that there is a vector $v_2$ in the coset $\bar{v}_2$ whose $T$-annihilator is $f(t)^{n_2}$, the $\bar{T}$-annihilator of $\bar{v}_2$. Let $w$ be any vector in $\bar{v}_2$. Then $f(T)^{n_2}(w) \in Z_1$. Hence there exists a polynomial $g(t)$ for which
$$f(T)^{n_2}(w) = g(T)(v_1) \qquad (1)$$
Since $f(t)^n$ is the minimal polynomial of $T$, we have by (1),
$$0 = f(T)^n(w) = f(T)^{n-n_2}\, g(T)(v_1)$$
But $f(t)^n$ is the $T$-annihilator of $v_1$; hence $f(t)^n$ divides $f(t)^{n-n_2} g(t)$ and so $g(t) = f(t)^{n_2} h(t)$ for some polynomial $h(t)$. We set
$$v_2 = w - h(T)(v_1)$$
Since $w - v_2 = h(T)(v_1) \in Z_1$, $v_2$ also belongs to the coset $\bar{v}_2$. Thus the $T$-annihilator of $v_2$ is a multiple of the $\bar{T}$-annihilator of $\bar{v}_2$. On the other hand, by (1),
$$f(T)^{n_2}(v_2) = f(T)^{n_2}(w - h(T)(v_1)) = f(T)^{n_2}(w) - g(T)(v_1) = 0$$
Consequently the $T$-annihilator of $v_2$ is $f(t)^{n_2}$ as claimed.
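Similarly, there exist vectors $v_3, \ldots, v_r \in V$ such that $v_i \in \bar{v}_i$ and that the $T$-annihilator of $v_i$ is $f(t)^{n_i}$, the $\bar{T}$-annihilator of $\bar{v}_i$. We set
$$Z_2 = Z(v_2, T), \ \ldots, \ Z_r = Z(v_r, T)$$
Let $d$ denote the degree of $f(t)$ so that $f(t)^{n_i}$ has degree $dn_i$. Then since $f(t)^{n_i}$ is both the $T$-annihilator of $v_i$ and the $\bar{T}$-annihilator of $\bar{v}_i$, we know that
$$\{v_i, T(v_i), \ldots, T^{dn_i - 1}(v_i)\} \quad\text{and}\quad \{\bar{v}_i, \bar{T}(\bar{v}_i), \ldots, \bar{T}^{dn_i - 1}(\bar{v}_i)\}$$
are bases for $Z(v_i, T)$ and $Z(\bar{v}_i, \bar{T})$ respectively, for $i = 2, \ldots, r$. But $\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$; hence
$$\{\bar{v}_2, \ldots, \bar{T}^{dn_2 - 1}(\bar{v}_2), \ \ldots, \ \bar{v}_r, \ldots, \bar{T}^{dn_r - 1}(\bar{v}_r)\}$$
is a basis for $\bar{V}$. Therefore by Problem 10.26 and the relation $\bar{T}^i(\bar{v}) = \overline{T^i(v)}$ (see Problem 10.27),
$$\{v_1, \ldots, T^{dn_1 - 1}(v_1), \ v_2, \ldots, T^{dn_2 - 1}(v_2), \ \ldots, \ v_r, \ldots, T^{dn_r - 1}(v_r)\}$$
is a basis for $V$. Thus by Theorem 10.4, $V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$, as required.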
It remains to show that the exponents $n_1, \ldots, n_r$ are uniquely determined by $T$. Since $d$ denotes the degree of $f(t)$,
$$\dim V = d(n_1 + \cdots + n_r) \quad\text{and}\quad \dim Z_i = dn_i, \quad i = 1, \ldots, r$$
Also, if $s$ is any positive integer then (Problem 10.59) $f(T)^s(Z_i)$ is a cyclic subspace generated by $f(T)^s(v_i)$ and it has dimension $d(n_i - s)$ if $n_i > s$ and dimension 0 if $n_i \leq s$.

Now any vector $v \in V$ can be written uniquely in the form $v = w_1 + \cdots + w_r$ where $w_i \in Z_i$. Hence any vector in $f(T)^s(V)$ can be written uniquely in the form
$$f(T)^s(v) = f(T)^s(w_1) + \cdots + f(T)^s(w_r)$$
where $f(T)^s(w_i) \in f(T)^s(Z_i)$. Let $t$ be the integer, dependent on $s$, for which
$$n_1 > s, \ \ldots, \ n_t > s, \quad n_{t+1} \leq s$$
Then
$$f(T)^s(V) = f(T)^s(Z_1) \oplus \cdots \oplus f(T)^s(Z_t)$$
and so
$$\dim (f(T)^s(V)) = d[(n_1 - s) + \cdots + (n_t - s)] \qquad (*)$$
The numbers on the left of $(*)$ are uniquely determined by $T$. Set $s = n - 1$ and $(*)$ determines the number of $n_i$ equal to $n$. Next set $s = n - 2$ and $(*)$ determines the number of $n_i$ (if any) equal to $n - 1$. We repeat the process until we set $s = 0$ and determine the number of $n_i$ equal to 1. Thus the $n_i$ are uniquely determined by $T$ and $V$, and the lemma is proved.


10.32. Let $V$ be a vector space of dimension 7 over $\mathbf{R}$, and let $T: V \to V$ be a linear operator with minimal polynomial $m(t) = (t^2 + 2)(t + 3)^3$. Find all the possible rational canonical forms for $T$.

The orders of the companion matrices must add up to 7. Also, one companion matrix must be $C(t^2 + 2)$ and one must be $C((t + 3)^3)$. Thus the rational canonical form of $T$ is exactly one of the following direct sums of companion matrices:

(i) $C(t^2 + 2) \oplus C(t^2 + 2) \oplus C((t + 3)^3)$
(ii) $C(t^2 + 2) \oplus C((t + 3)^3) \oplus C((t + 3)^2)$
(iii) $C(t^2 + 2) \oplus C((t + 3)^3) \oplus C(t + 3) \oplus C(t + 3)$

That is, the block diagonal matrices formed from the corresponding blocks
$$C(t^2 + 2) = \begin{pmatrix} 0 & -2 \\ 1 & 0 \end{pmatrix}, \quad C((t + 3)^3) = \begin{pmatrix} 0 & 0 & -27 \\ 1 & 0 & -27 \\ 0 & 1 & -9 \end{pmatrix}, \quad C((t + 3)^2) = \begin{pmatrix} 0 & -9 \\ 1 & -6 \end{pmatrix}, \quad C(t + 3) = (-3)$$
PROJECTIONS 


10.33. Suppose $V = W_1 \oplus \cdots \oplus W_r$. The projection of $V$ into its subspace $W_k$ is the mapping $E: V \to V$ defined by $E(v) = w_k$ where $v = w_1 + \cdots + w_r$, $w_i \in W_i$. Show that (i) $E$ is linear, (ii) $E^2 = E$.

(i) Since the sum $v = w_1 + \cdots + w_r$, $w_i \in W_i$, is uniquely determined by $v$, the mapping $E$ is well defined. Suppose, for $u \in V$, $u = w_1' + \cdots + w_r'$, $w_i' \in W_i$. Then
$$v + u = (w_1 + w_1') + \cdots + (w_r + w_r') \quad\text{and}\quad kv = kw_1 + \cdots + kw_r, \qquad kw_i, \ w_i + w_i' \in W_i$$
are the unique sums corresponding to $v + u$ and $kv$. Hence
$$E(v + u) = w_k + w_k' = E(v) + E(u) \quad\text{and}\quad E(kv) = kw_k = kE(v)$$
and therefore $E$ is linear.

(ii) We have that
$$w_k = 0 + \cdots + 0 + w_k + 0 + \cdots + 0$$
is the unique sum corresponding to $w_k \in W_k$; hence $E(w_k) = w_k$. Then for any $v \in V$,
$$E^2(v) = E(E(v)) = E(w_k) = w_k = E(v)$$
Thus $E^2 = E$, as required.


10.34. Suppose $E: V \to V$ is linear and $E^2 = E$. Show that: (i) $E(u) = u$ for any $u \in \operatorname{Im} E$, i.e. the restriction of $E$ to its image is the identity mapping; (ii) $V$ is the direct sum of the image and kernel of $E$: $V = \operatorname{Im} E \oplus \operatorname{Ker} E$; (iii) $E$ is the projection of $V$ into $\operatorname{Im} E$, its image. Thus, by the preceding problem, a linear mapping $T: V \to V$ is a projection if and only if $T^2 = T$; this characterization of a projection is frequently used as its definition.

(i) If $u \in \operatorname{Im} E$, then there exists $v \in V$ for which $E(v) = u$; hence
$$E(u) = E(E(v)) = E^2(v) = E(v) = u$$
as required.

(ii) Let $v \in V$. We can write $v$ in the form $v = E(v) + v - E(v)$. Now $E(v) \in \operatorname{Im} E$ and, since
$$E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0$$
$v - E(v) \in \operatorname{Ker} E$. Accordingly, $V = \operatorname{Im} E + \operatorname{Ker} E$.

Now suppose $w \in \operatorname{Im} E \cap \operatorname{Ker} E$. By (i), $E(w) = w$ because $w \in \operatorname{Im} E$. On the other hand, $E(w) = 0$ because $w \in \operatorname{Ker} E$. Thus $w = 0$ and so $\operatorname{Im} E \cap \operatorname{Ker} E = \{0\}$. These two conditions imply that $V$ is the direct sum of the image and kernel of $E$.

(iii) Let $v \in V$ and suppose $v = u + w$ where $u \in \operatorname{Im} E$ and $w \in \operatorname{Ker} E$. Note that $E(u) = u$ by (i), and $E(w) = 0$ because $w \in \operatorname{Ker} E$. Hence
$$E(v) = E(u + w) = E(u) + E(w) = u + 0 = u$$
That is, $E$ is the projection of $V$ into its image.

10.35. Suppose $V = U \oplus W$ and suppose $T: V \to V$ is linear. Show that $U$ and $W$ are both $T$-invariant if and only if $TE = ET$ where $E$ is the projection of $V$ into $U$.


Observe that $E(v) \in U$ for every $v \in V$, and that (i) $E(v) = v$ iff $v \in U$, (ii) $E(v) = 0$ iff $v \in W$.

Suppose $ET = TE$. Let $u \in U$. Since $E(u) = u$,
$$T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) \in U$$
Hence $U$ is $T$-invariant. Now let $w \in W$. Since $E(w) = 0$,
$$E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0 \quad\text{and so}\quad T(w) \in W$$
Hence $W$ is also $T$-invariant.

Conversely, suppose $U$ and $W$ are both $T$-invariant. Let $v \in V$ and suppose $v = u + w$ where $u \in U$ and $w \in W$. Then $T(u) \in U$ and $T(w) \in W$; hence $E(T(u)) = T(u)$ and $E(T(w)) = 0$. Thus
$$(ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)$$
and
$$(TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)$$
That is, $(ET)(v) = (TE)(v)$ for every $v \in V$; therefore $ET = TE$ as required.
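Remark. The last two problems can be illustrated in $\mathbf{R}^2$. Take, as an assumed example not drawn from the text, $U$ spanned by $(1, 0)$ and $W$ spanned by $(1, 1)$; since $(x, y) = (x - y)(1, 0) + y(1, 1)$, the projection into $U$ sends $(x, y)$ to $(x - y, 0)$. The sketch below checks $E^2 = E$ and compares $ET$ with $TE$ for one operator that leaves $U$ and $W$ invariant and one that does not; all names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_projection(e):
    """A linear map is a projection exactly when E^2 = E."""
    return mat_mul(e, e) == e


def commute(a, b):
    """Return True when AB = BA."""
    return mat_mul(a, b) == mat_mul(b, a)


E = [[1, -1], [0, 0]]   # projection into U = span{(1,0)} along W = span{(1,1)}
T = [[2, 1], [0, 3]]    # T(1,0) = 2(1,0) and T(1,1) = 3(1,1): U, W invariant
S = [[0, 1], [1, 0]]    # S(1,0) = (0,1), which is not in U

print(is_projection(E))
print(commute(E, T))
print(commute(E, S))
```

Here $E$ commutes with $T$ but not with $S$, as Problem 10.35 predicts.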


Supplementary Problems 


INVARIANT SUBSPACES 


10.36. Suppose $W$ is invariant under $T: V \to V$. Show that $W$ is invariant under $f(T)$ for any polynomial $f(t)$.

10.37. Show that every subspace of $V$ is invariant under $I$ and 0, the identity and zero operators.

10.38. Suppose $W$ is invariant under $S: V \to V$ and $T: V \to V$. Show that $W$ is also invariant under $S + T$ and $ST$.

10.39. Let $T: V \to V$ be linear and let $W$ be the eigenspace belonging to an eigenvalue $\lambda$ of $T$. Show that $W$ is $T$-invariant.

10.40. Let $V$ be a vector space of odd dimension (greater than 1) over the real field $\mathbf{R}$. Show that any linear operator on $V$ has an invariant subspace other than $V$ or $\{0\}$.


10.41. Determine the invariant subspaces of $A = \begin{pmatrix} 2 & -4 \\ 5 & -2 \end{pmatrix}$ viewed as a linear operator on (i) $\mathbf{R}^2$, (ii) $\mathbf{C}^2$.


10.42. Suppose $\dim V = n$. Show that $T: V \to V$ has a triangular matrix representation if and only if there exist $T$-invariant subspaces $W_1 \subset W_2 \subset \cdots \subset W_n = V$ for which $\dim W_k = k$, $k = 1, \ldots, n$.


INVARIANT DIRECT-SUMS 


10.43. The subspaces $W_1, \ldots, W_r$ are said to be independent if $w_1 + \cdots + w_r = 0$, $w_i \in W_i$, implies that each $w_i = 0$. Show that $L(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if the $W_i$ are independent. (Here $L(W_i)$ denotes the linear span of the $W_i$.)

10.44. Show that $V = W_1 \oplus \cdots \oplus W_r$ if and only if (i) $V = L(W_i)$ and (ii) $W_k \cap L(W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_r) = \{0\}$, $k = 1, \ldots, r$.

10.45. Show that $L(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if $\dim L(W_i) = \dim W_1 + \cdots + \dim W_r$.


10.46. Suppose the characteristic polynomial of $T: V \to V$ is $\Delta(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$ where the $f_i(t)$ are distinct monic irreducible polynomials. Let $V = W_1 \oplus \cdots \oplus W_r$ be the primary decomposition of $V$ into $T$-invariant subspaces. Show that $f_i(t)^{n_i}$ is the characteristic polynomial of the restriction of $T$ to $W_i$.


NILPOTENT OPERATORS 


10.47. Suppose $S$ and $T$ are nilpotent operators which commute, i.e. $ST = TS$. Show that $S + T$ and $ST$ are also nilpotent.


10.48. Suppose $A$ is a supertriangular matrix, i.e. all entries on and below the main diagonal are 0. Show that $A$ is nilpotent.


10.49. Let $V$ be the vector space of polynomials of degree $\leq n$. Show that the differential operator on $V$ is nilpotent of index $n + 1$.


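10.50. Show that the following nilpotent matrices of order $n$ are similar:
$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$$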

10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index 
of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4. 


JORDAN CANONICAL FORM 


10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial A(t) 
and minimal polynomial m(t) are as follows: 
(i) $\Delta(t) = (t - 2)^4 (t - 3)^2$, $m(t) = (t - 2)^2 (t - 3)^2$
(ii) $\Delta(t) = (t - 7)^5$, $m(t) = (t - 7)^2$
(iii) $\Delta(t) = (t - 2)^7$, $m(t) = (t - 2)^3$
(iv) $\Delta(t) = (t - 3)^4 (t - 5)^4$, $m(t) = (t - 3)^2 (t - 5)^2$


10.53. Show that every complex matrix is similar to its transpose. (Hint. Use Jordan canonical form and 
Problem 10.50.) 


10.54. Show that all complex matrices $A$ of order $n$ for which $A^n = I$ are similar.


10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with 
only real entries. 


CYCLIC SUBSPACES 

10.56. Suppose 7:V-—>V is linear. Prove that Z(v, T) is the intersection of all T-invariant subspaces 
containing v. 

10.57. Let $f(t)$ and $g(t)$ be the $T$-annihilators of $u$ and $v$ respectively. Show that if $f(t)$ and $g(t)$ are relatively prime, then $f(t)g(t)$ is the $T$-annihilator of $u + v$.


10.58. Prove that Z(u,T) = Z(v,T) if and only if g(T)(u) =v where g(t) is relatively prime to the 
T-annihilator of u. 

10.59. Let W=Z(v,T), and suppose the 7-annihilator of v is f(t)” where f(t) is a monic irreducible Poly- 
nomial of degree d. Show that f(T)s(W) is a cyclic subspace generated by f(T)s(v) and it has dimen- 
sion d(n—s) if n >s and dimension 0if n=s. 


RATIONAL CANONICAL FORM 


10.60. Find all possible rational canonical forms for:
(i) 6 × 6 matrices with minimum polynomial m(t) = (t^2 + 3)(t + 1)^2
(ii) 6 × 6 matrices with minimum polynomial m(t) = (t + 1)
(iii) 8 × 8 matrices with minimum polynomial m(t) = (t^2 + 2)^2 (t + 3)^2


10.61. Let A be a 4 × 4 matrix with minimum polynomial m(t) = (t^2 + 1)(t^2 - 3). Find the rational
canonical form for A if A is a matrix over (i) the rational field Q, (ii) the real field R, (iii) the
complex field C.


10.62. Find the rational canonical form for the Jordan block

        ( λ 1 0 0 )
        ( 0 λ 1 0 )
        ( 0 0 λ 1 )
        ( 0 0 0 λ )


10.63. Prove that the characteristic polynomial of an operator T: V → V is a product of its elementary
divisors.


10.64. Prove that two 3 × 3 matrices with the same minimum and characteristic polynomials are similar.


10.65. Let C(f(t)) denote the companion matrix to an arbitrary polynomial f(t). Show that f(t) is the
characteristic polynomial of C(f(t)).


PROJECTIONS


10.66. Suppose V = W_1 ⊕ ... ⊕ W_r. Let E_i denote the projection of V into W_i. Prove: (i) E_iE_j = 0,
i ≠ j; (ii) I = E_1 + ... + E_r.


10.67. Let E_1, ..., E_r be linear operators on V such that: (i) E_i^2 = E_i, i.e. the E_i are projections;
(ii) E_iE_j = 0, i ≠ j; (iii) I = E_1 + ... + E_r. Prove that V = Im E_1 ⊕ ... ⊕ Im E_r.


10.68. Suppose E: V → V is a projection, i.e. E^2 = E. Prove that E has a matrix representation of the
form

        ( I_r 0 )
        (  0  0 )

where r is the rank of E and I_r is the r-square identity matrix.


10.69. Prove that any two projections of the same rank are similar. (Hint. Use the result of Problem
10.68.)


10.70. Suppose E: V → V is a projection. Prove:
(i) I - E is a projection and V = Im E ⊕ Im (I - E); (ii) I + E is invertible (if 1 + 1 ≠ 0).


QUOTIENT SPACES 


10.71. Let W be a subspace of V. Suppose the set of cosets {v_1 + W, v_2 + W, ..., v_n + W} in V/W is
linearly independent. Show that the set of vectors {v_1, v_2, ..., v_n} in V is also linearly independent.


10.72. Let W be a subspace of V. Suppose the set of vectors {u_1, u_2, ..., u_n} in V is linearly independent,
and that L(u_i) ∩ W = {0}. Show that the set of cosets {u_1 + W, ..., u_n + W} in V/W is also
linearly independent.


10.73. Suppose V = U ⊕ W and that {u_1, ..., u_n} is a basis of U. Show that {u_1 + W, ..., u_n + W} is
a basis of the quotient space V/W. (Observe that no condition is placed on the dimensionality of
V or W.)


10.74. Let W be the solution space of the linear equation
        a_1x_1 + a_2x_2 + ... + a_nx_n = 0,    a_i ∈ K
and let v = (b_1, b_2, ..., b_n) ∈ K^n. Prove that the coset v + W of W in K^n is the solution set of the
linear equation
        a_1x_1 + a_2x_2 + ... + a_nx_n = b    where b = a_1b_1 + ... + a_nb_n


10.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible
by t^4, i.e. of the form a_0t^4 + a_1t^5 + ... + a_{n-4}t^n. Show that the quotient space V/W is of dimension 4.


10.76. Let U and W be subspaces of V such that W ⊂ U ⊂ V. Note that any coset u + W of W in U may
also be viewed as a coset of W in V since u ∈ U implies u ∈ V; hence U/W is a subset of V/W.
Prove that (i) U/W is a subspace of V/W, (ii) dim (V/W) - dim (U/W) = dim (V/U).


10.77. Let U and W be subspaces of V. Show that the cosets of U ∩ W in V can be obtained by
intersecting each of the cosets of U in V by each of the cosets of W in V:

        V/(U ∩ W) = {(v + U) ∩ (v' + W): v, v' ∈ V}


10.78. Let T: V → V' be linear with kernel W and image U. Show that the quotient space V/W is
isomorphic to U under the mapping θ: V/W → U defined by θ(v + W) = T(v). Furthermore, show that
T = i∘θ∘η where η: V → V/W is the natural mapping of V into V/W, i.e. η(v) = v + W, and
i: U ⊂ V' is the inclusion mapping, i.e. i(u) = u. (See diagram.)

[Diagram: V → V/W → U → V', with arrows η, θ and i, and T: V → V' the composite.]


Answers to Supplementary Problems


10.41. (i) R^2 and {0}
       (ii) C^2, {0}, W_1 = L((2, 1 - 2i)), W_2 = L((2, 1 + 2i))

[Matrix answers, including 10.60 (i), are not recoverable from this copy.]

Chapter 11 


Linear Functionals and the Dual Space 


INTRODUCTION 


In this chapter we study linear mappings from a vector space V into its field K of scalars. 
(Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally 
all the theorems and results for arbitrary linear mappings on V hold for this special case. 
However, we treat these mappings separately because of their fundamental importance and 
because the special relationship of V to K gives rise to new notions and results which do not 
apply in the general case. 


LINEAR FUNCTIONALS AND THE DUAL SPACE 


Let V be a vector space over a field K. A mapping φ: V → K is termed a linear functional
(or linear form) if, for every u, v ∈ V and every a, b ∈ K,

        φ(au + bv) = aφ(u) + bφ(v)

In other words, a linear functional on V is a linear mapping from V into K.

Example 11.1: Let π_i: K^n → K be the ith projection mapping, i.e. π_i(a_1, a_2, ..., a_n) = a_i. Then π_i
is linear and so it is a linear functional on K^n.

Example 11.2: Let V be the vector space of polynomials in t over R. Let J: V → R be the integral
operator defined by J(p(t)) = ∫_0^1 p(t) dt. Recall that J is linear; and hence it is
a linear functional on V.

Example 11.3: Let V be the vector space of n-square matrices over K. Let T: V → K be the trace
mapping
        T(A) = a_11 + a_22 + ... + a_nn,    where A = (a_ij)
That is, T assigns to a matrix A the sum of its diagonal elements. This map is
linear (Problem 11.27) and so it is a linear functional on V.


By Theorem 6.6, the set of linear functionals on a vector space V over a field K is also
a vector space over K with addition and scalar multiplication defined by
        (φ + σ)(v) = φ(v) + σ(v)    and    (kφ)(v) = kφ(v)
where φ and σ are linear functionals on V and k ∈ K. This space is called the dual space of
V and is denoted by V*.


Example 11.4: Let V = K^n, the vector space of n-tuples which we write as column vectors. Then
the dual space V* can be identified with the space of row vectors. In particular,
any linear functional φ = (a_1, ..., a_n) in V* has the representation

        φ(x_1, ..., x_n) = (a_1, a_2, ..., a_n) (x_1, x_2, ..., x_n)^t

or simply
        φ(x_1, ..., x_n) = a_1x_1 + a_2x_2 + ... + a_nx_n


Historically, the above formal expression was termed a linear form. 


DUAL BASIS 


Suppose V is a vector space of dimension n over K. By Theorem 6.7, the dimension of
the dual space V* is also n (since K is of dimension 1 over itself). In fact, each basis of V
determines a basis of V* as follows:


Theorem 11.1: Suppose {v_1, ..., v_n} is a basis of V over K. Let φ_1, ..., φ_n ∈ V* be the
linear functionals defined by

        φ_i(v_j) = δ_ij = 1 if i = j,  0 if i ≠ j

Then {φ_1, ..., φ_n} is a basis of V*.


The above basis {φ_i} is termed the basis dual to {v_i} or the dual basis. The above formula,
which uses the Kronecker delta δ_ij, is a short way of writing


        φ_1(v_1) = 1, φ_1(v_2) = 0, φ_1(v_3) = 0, ..., φ_1(v_n) = 0
        φ_2(v_1) = 0, φ_2(v_2) = 1, φ_2(v_3) = 0, ..., φ_2(v_n) = 0
        ............................................................
        φ_n(v_1) = 0, φ_n(v_2) = 0, ..., φ_n(v_{n-1}) = 0, φ_n(v_n) = 1


By Theorem 6.2, these linear mappings φ_i are unique and well defined.


Example 11.5: Consider the following basis of R^2: {v_1 = (2, 1), v_2 = (3, 1)}. Find the dual basis
{φ_1, φ_2}.
We seek linear functionals φ_1(x, y) = ax + by and φ_2(x, y) = cx + dy such that

        φ_1(v_1) = 1, φ_1(v_2) = 0, φ_2(v_1) = 0, φ_2(v_2) = 1

Thus
        φ_1(v_1) = φ_1(2, 1) = 2a + b = 1
        φ_1(v_2) = φ_1(3, 1) = 3a + b = 0        or  a = -1, b = 3

        φ_2(v_1) = φ_2(2, 1) = 2c + d = 0
        φ_2(v_2) = φ_2(3, 1) = 3c + d = 1        or  c = 1, d = -2

Hence the dual basis is {φ_1(x, y) = -x + 3y, φ_2(x, y) = x - 2y}.
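
The computation in Example 11.5 can be mechanized. The following Python sketch is an illustration only and is not part of the original text. It assumes the basis vectors are written as the columns of a matrix B, so that the rows of the inverse of B give the coefficients of the dual functionals (the identity B^-1 B = I is exactly the condition φ_i(v_j) = δ_ij). The helper names invert and dual_basis are hypothetical.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_basis(basis):
    """Return the coefficient rows of the functionals dual to `basis`."""
    n = len(basis)
    # Row i of `columns` holds the ith coordinate of every basis vector,
    # i.e. the basis vectors are the columns of this matrix.
    columns = [[basis[j][i] for j in range(n)] for i in range(n)]
    return invert(columns)


phis = dual_basis([(2, 1), (3, 1)])
for k, (a, b) in enumerate(phis, start=1):
    print(f"phi_{k}(x, y) = ({a})x + ({b})y")
```

Exact rational arithmetic (fractions.Fraction) is used so that the coefficients can be compared directly with the hand computation.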


The next theorems give relationships between bases and their duals. 


Theorem 11.2: Let {v_1, ..., v_n} be a basis of V and let {φ_1, ..., φ_n} be the dual basis of V*.
Then for any vector u ∈ V,

        u = φ_1(u)v_1 + φ_2(u)v_2 + ... + φ_n(u)v_n

and, for any linear functional σ ∈ V*,

        σ = σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n


Theorem 11.3: Let {v_1, ..., v_n} and {w_1, ..., w_n} be bases of V and let {φ_1, ..., φ_n} and
{σ_1, ..., σ_n} be the bases of V* dual to {v_i} and {w_i} respectively. Suppose
P is the transition matrix from {v_i} to {w_i}. Then (P^-1)^t is the transition
matrix from {φ_i} to {σ_i}.


SECOND DUAL SPACE 


We repeat: every vector space V has a dual space V* which consists of all the linear 
functionals on V. Thus V* itself has a dual space V**, called the second dual of V, which 
consists of all the linear functionals on V*. 


We now show that each v ∈ V determines a specific element v̂ ∈ V**. First of all,
for any φ ∈ V* we define
        v̂(φ) = φ(v)

It remains to be shown that this map v̂: V* → K is linear. For any scalars a, b ∈ K and
any linear functionals φ, σ ∈ V*, we have


        v̂(aφ + bσ) = (aφ + bσ)(v) = aφ(v) + bσ(v) = a v̂(φ) + b v̂(σ)

That is, v̂ is linear and so v̂ ∈ V**. The following theorem applies.


Theorem 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an isomorphism of V
onto V**.
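
The definition v̂(φ) = φ(v) can be mirrored with functions as values. The sketch below is only an illustration: the names hat and combine, and the two sample functionals on R^2, are assumptions and do not come from the text. It checks the linearity identity v̂(aφ + bσ) = a v̂(φ) + b v̂(σ) for one choice of scalars.

```python
def hat(v):
    """Return the evaluation functional v_hat on V*, defined by v_hat(phi) = phi(v)."""
    return lambda phi: phi(v)


def combine(a, phi, b, sigma):
    """Return the linear functional a*phi + b*sigma."""
    return lambda v: a * phi(v) + b * sigma(v)


# Two sample linear functionals on R^2 (hypothetical choices).
phi = lambda v: v[0] + 2 * v[1]
sigma = lambda v: 3 * v[0] - v[1]

v = (4, -1)
v_hat = hat(v)

lhs = v_hat(combine(2, phi, -5, sigma))   # v_hat(2*phi - 5*sigma)
rhs = 2 * v_hat(phi) - 5 * v_hat(sigma)   # 2*v_hat(phi) - 5*v_hat(sigma)
print(lhs, rhs)
```

A single numerical check does not prove linearity; it only shows how the definition is evaluated.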


The above mapping v ↦ v̂ is called the natural mapping of V into V**. We emphasize
that this mapping is never onto V** if V is not finite-dimensional. However, it is always
linear and, moreover, it is always one-to-one.


Now suppose V does have finite dimension. By the above theorem the natural mapping
determines an isomorphism between V and V**. Unless otherwise stated we shall identify
V with V** by this mapping. Accordingly we shall view V as the space of linear functionals
on V* and shall write V = V**. We remark that if {φ_i} is the basis of V* dual to a basis
{v_i} of V, then {v_i} is the basis of V = V** which is dual to {φ_i}.


ANNIHILATORS 


Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional
φ ∈ V* is called an annihilator of W if φ(w) = 0 for every w ∈ W, i.e. if φ(W) = {0}.
We show that the set of all such mappings, denoted by W^0 and called the annihilator of W,
is a subspace of V*. Clearly 0 ∈ W^0. Now suppose φ, σ ∈ W^0. Then, for any scalars
a, b ∈ K and for any w ∈ W,


        (aφ + bσ)(w) = aφ(w) + bσ(w) = a0 + b0 = 0


Thus aφ + bσ ∈ W^0 and so W^0 is a subspace of V*.


In the case that W is a subspace of V, we have the following relationship between W and
its annihilator W^0.


Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then
(i) dim W + dim W^0 = dim V and (ii) W^00 = W.


Here W^00 = {v ∈ V: φ(v) = 0 for every φ ∈ W^0} or, equivalently, W^00 = (W^0)^0 where
W^00 is viewed as a subspace of V under the identification of V and V**.


The concept of an annihilator enables us to give another interpretation of a homogeneous
system of linear equations,

        a_11x_1 + a_12x_2 + ... + a_1nx_n = 0
        a_21x_1 + a_22x_2 + ... + a_2nx_n = 0        (*)
        .....................................
        a_m1x_1 + a_m2x_2 + ... + a_mnx_n = 0

Here each row (a_i1, a_i2, ..., a_in) of the coefficient matrix A = (a_ij) is viewed as an element
of K^n and each solution vector φ = (x_1, x_2, ..., x_n) is viewed as an element of the dual space.
In this context, the solution space S of (*) is the annihilator of the rows of A and hence of
the row space of A. Consequently, using Theorem 11.5, we again obtain the following
fundamental result on the dimension of the solution space of a homogeneous system of
linear equations:


        dim S = dim K^n - dim (row space of A) = n - rank (A)
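
As a small numerical illustration of the formula dim S = n - rank (A), the following Python sketch computes the rank of a coefficient matrix by Gaussian elimination over the rationals. The function name rank and the sample matrix are assumptions made for this example; they are not taken from the text.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, computed by row reduction to echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rk = 0  # number of pivots found so far
    for col in range(ncols):
        # Look for a nonzero entry in this column at or below row `rk`.
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        # Eliminate the entries below the pivot.
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# Hypothetical system in n = 4 unknowns; the third row is the sum of the first two.
A = [[1, 2, -3, 4],
     [0, 1, 4, -1],
     [1, 3, 1, 3]]

n = len(A[0])
dim_S = n - rank(A)
print("rank(A) =", rank(A), " dim S =", dim_S)
```

Here the row space has dimension 2, so its annihilator, the solution space S, has dimension 4 - 2 = 2.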


TRANSPOSE OF A LINEAR MAPPING 


Let T: V → U be an arbitrary linear mapping from a vector space V into a vector space
U. Now for any linear functional φ ∈ U*, the composition φ∘T is a linear mapping from
V into K:


        V --T--> U --φ--> K


That is, φ∘T ∈ V*. Thus the correspondence
        φ ↦ φ∘T
is a mapping from U* into V*; we denote it by T^t and call it the transpose of T. In other
words, T^t: U* → V* is defined by
        T^t(φ) = φ∘T


Thus (T^t(φ))(v) = φ(T(v)) for every v ∈ V.


Theorem 11.6: The transpose mapping T^t defined above is linear.
Proof. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ U*,
        T^t(aφ + bσ) = (aφ + bσ)∘T = a(φ∘T) + b(σ∘T)
                     = a T^t(φ) + b T^t(σ)
That is, T^t is linear as claimed.


We emphasize that if T is a linear mapping from V into U, then T^t is a linear mapping
from U* into V*:
        V --T--> U        V* <--T^t-- U*
The name "transpose" for the mapping T^t no doubt derives from the following theorem.

Theorem 11.7: Let T: V → U be linear, and let A be the matrix representation of T relative
to bases {v_i} of V and {u_i} of U. Then the transpose matrix A^t is
the matrix representation of T^t: U* → V* relative to the bases dual to
{u_i} and {v_i}.
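
The defining identity (T^t(φ))(v) = φ(T(v)) can be checked numerically in coordinates. The sketch below assumes the usual bases of K^3 and K^2 and a map given by T(v) = Av with vectors written as columns; the matrix A, the functional phi, and the helper names are hypothetical choices for illustration.

```python
def mat_vec(A, v):
    """Product of the matrix A with the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def transpose(A):
    """Transpose of the matrix A."""
    return [list(col) for col in zip(*A)]


def apply(phi, v):
    """Value of the functional with coefficient list phi at the vector v."""
    return sum(c * x for c, x in zip(phi, v))


A = [[1, 2, 0],
     [0, 1, -1]]       # T: K^3 -> K^2, T(v) = A v
phi = [3, -2]          # phi(y1, y2) = 3*y1 - 2*y2, a functional on K^2

# Coordinates of T^t(phi) relative to the dual of the usual basis of K^3.
pullback = mat_vec(transpose(A), phi)

v = [4, -1, 5]
lhs = apply(pullback, v)          # (T^t(phi))(v)
rhs = apply(phi, mat_vec(A, v))   # phi(T(v))
print(pullback, lhs, rhs)
```

In these coordinates the coefficients of T^t(φ) are obtained by applying A^t to the coefficients of φ, which is the content of Theorem 11.7 for the usual bases.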




Solved Problems 


DUAL SPACES AND BASES 


11.1. Let φ: R^2 → R and σ: R^2 → R be the linear functionals defined by φ(x, y) = x + 2y
and σ(x, y) = 3x - y. Find (i) φ + σ, (ii) 4φ, (iii) 2φ - 5σ.

(i) (φ + σ)(x, y) = φ(x, y) + σ(x, y) = x + 2y + 3x - y = 4x + y

(ii) (4φ)(x, y) = 4φ(x, y) = 4(x + 2y) = 4x + 8y

(iii) (2φ - 5σ)(x, y) = 2φ(x, y) - 5σ(x, y) = 2(x + 2y) - 5(3x - y) = -13x + 9y


11.2. Consider the following basis of R^3: {v_1 = (1, -1, 3), v_2 = (0, 1, -1), v_3 = (0, 3, -2)}.
Find the dual basis {φ_1, φ_2, φ_3}.


We seek linear functionals

        φ_1(x, y, z) = a_1x + a_2y + a_3z,  φ_2(x, y, z) = b_1x + b_2y + b_3z,  φ_3(x, y, z) = c_1x + c_2y + c_3z

such that
        φ_1(v_1) = 1    φ_1(v_2) = 0    φ_1(v_3) = 0
        φ_2(v_1) = 0    φ_2(v_2) = 1    φ_2(v_3) = 0
        φ_3(v_1) = 0    φ_3(v_2) = 0    φ_3(v_3) = 1


We find φ_1 as follows:

        φ_1(v_1) = φ_1(1, -1, 3) = a_1 - a_2 + 3a_3 = 1
        φ_1(v_2) = φ_1(0, 1, -1) = a_2 - a_3 = 0
        φ_1(v_3) = φ_1(0, 3, -2) = 3a_2 - 2a_3 = 0

Solving the system of equations, we obtain a_1 = 1, a_2 = 0, a_3 = 0. Thus φ_1(x, y, z) = x.


We next find φ_2:

        φ_2(v_1) = φ_2(1, -1, 3) = b_1 - b_2 + 3b_3 = 0
        φ_2(v_2) = φ_2(0, 1, -1) = b_2 - b_3 = 1
        φ_2(v_3) = φ_2(0, 3, -2) = 3b_2 - 2b_3 = 0

Solving the system, we obtain b_1 = 7, b_2 = -2, b_3 = -3. Hence φ_2(x, y, z) = 7x - 2y - 3z.


Finally, we find φ_3:

        φ_3(v_1) = φ_3(1, -1, 3) = c_1 - c_2 + 3c_3 = 0
        φ_3(v_2) = φ_3(0, 1, -1) = c_2 - c_3 = 0
        φ_3(v_3) = φ_3(0, 3, -2) = 3c_2 - 2c_3 = 1

Solving the system, we obtain c_1 = -2, c_2 = 1, c_3 = 1. Thus φ_3(x, y, z) = -2x + y + z.
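
The answer can be checked by evaluating each functional on each basis vector; the resulting table should be the identity matrix (δ_ij). The short Python sketch below does only this verification. The helper name apply is a hypothetical choice.

```python
basis = [(1, -1, 3), (0, 1, -1), (0, 3, -2)]   # v_1, v_2, v_3
phis = [(1, 0, 0), (7, -2, -3), (-2, 1, 1)]    # coefficients of phi_1, phi_2, phi_3


def apply(phi, v):
    """Value of the functional with coefficient tuple phi at the vector v."""
    return sum(c * x for c, x in zip(phi, v))


# table[i][j] = phi_{i+1}(v_{j+1})
table = [[apply(phi, v) for v in basis] for phi in phis]
for row in table:
    print(row)
```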


11.3. Let V be the vector space of polynomials over R of degree ≤ 1, i.e. V =
{a + bt: a, b ∈ R}. Let φ_1: V → R and φ_2: V → R be defined by

        φ_1(f(t)) = ∫_0^1 f(t) dt    and    φ_2(f(t)) = ∫_0^2 f(t) dt

(We remark that φ_1 and φ_2 are linear and so belong to the dual space V*.) Find the
basis {v_1, v_2} of V which is dual to {φ_1, φ_2}.

Let v_1 = a + bt and v_2 = c + dt. By definition of the dual basis,
        φ_1(v_1) = 1, φ_2(v_1) = 0    and    φ_1(v_2) = 0, φ_2(v_2) = 1

Thus
        φ_1(v_1) = ∫_0^1 (a + bt) dt = a + (1/2)b = 1
        φ_2(v_1) = ∫_0^2 (a + bt) dt = 2a + 2b = 0        or  a = 2, b = -2

        φ_1(v_2) = ∫_0^1 (c + dt) dt = c + (1/2)d = 0
        φ_2(v_2) = ∫_0^2 (c + dt) dt = 2c + 2d = 1        or  c = -1/2, d = 1

In other words, {2 - 2t, -1/2 + t} is the basis of V which is dual to {φ_1, φ_2}.
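
This result can be verified with exact arithmetic. In the sketch below a polynomial a + bt is stored as the pair (a, b), and the integral from 0 to x is evaluated by the closed form ax + bx^2/2. The representation and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def integral(poly, upper):
    """Integral of a + b*t from 0 to `upper`, for poly = (a, b)."""
    a, b = poly
    return a * upper + Fraction(b) * upper ** 2 / 2


def phi1(poly):
    return integral(poly, 1)


def phi2(poly):
    return integral(poly, 2)


v1 = (2, -2)                 # 2 - 2t
v2 = (Fraction(-1, 2), 1)    # -1/2 + t

# values[i][j] = phi_{i+1}(v_{j+1}); a dual pair gives the identity matrix.
values = [[phi1(v1), phi1(v2)],
          [phi2(v1), phi2(v2)]]
print(values)
```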


11.4. Prove Theorem 11.1: Suppose {v_1, ..., v_n} is a basis of V over K. Let φ_1, ..., φ_n ∈ V*
be the linear functionals defined by

        φ_i(v_j) = δ_ij = 1 if i = j,  0 if i ≠ j

Then {φ_1, ..., φ_n} is a basis of V*.


We first show that {φ_1, ..., φ_n} spans V*. Let φ be an arbitrary element of V*, and suppose

        φ(v_1) = k_1, φ(v_2) = k_2, ..., φ(v_n) = k_n

Set σ = k_1φ_1 + ... + k_nφ_n. Then
        σ(v_1) = (k_1φ_1 + ... + k_nφ_n)(v_1)
               = k_1φ_1(v_1) + k_2φ_2(v_1) + ... + k_nφ_n(v_1)
               = k_1·1 + k_2·0 + ... + k_n·0 = k_1
Similarly, for i = 2, ..., n,
        σ(v_i) = (k_1φ_1 + ... + k_nφ_n)(v_i)
               = k_1φ_1(v_i) + ... + k_iφ_i(v_i) + ... + k_nφ_n(v_i) = k_i

Thus φ(v_i) = σ(v_i) for i = 1, ..., n. Since φ and σ agree on the basis vectors,
φ = σ = k_1φ_1 + ... + k_nφ_n. Accordingly, {φ_1, ..., φ_n} spans V*.

It remains to be shown that {φ_1, ..., φ_n} is linearly independent. Suppose
        a_1φ_1 + a_2φ_2 + ... + a_nφ_n = 0
Applying both sides to v_1, we obtain
        0 = 0(v_1) = (a_1φ_1 + ... + a_nφ_n)(v_1)
          = a_1φ_1(v_1) + a_2φ_2(v_1) + ... + a_nφ_n(v_1)
          = a_1·1 + a_2·0 + ... + a_n·0 = a_1
Similarly, for i = 2, ..., n,
        0 = 0(v_i) = (a_1φ_1 + ... + a_nφ_n)(v_i)
          = a_1φ_1(v_i) + ... + a_iφ_i(v_i) + ... + a_nφ_n(v_i) = a_i

That is, a_1 = 0, ..., a_n = 0. Hence {φ_1, ..., φ_n} is linearly independent and so it is a basis of V*.


11.5. Prove Theorem 11.2: Let {v_1, ..., v_n} be a basis of V and let {φ_1, ..., φ_n} be the dual
basis of V*. Then, for any vector u ∈ V,
        u = φ_1(u)v_1 + φ_2(u)v_2 + ... + φ_n(u)v_n        (1)
and, for any linear functional σ ∈ V*,
        σ = σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n        (2)


Suppose
        u = a_1v_1 + a_2v_2 + ... + a_nv_n        (3)
Then
        φ_1(u) = a_1φ_1(v_1) + a_2φ_1(v_2) + ... + a_nφ_1(v_n) = a_1·1 + a_2·0 + ... + a_n·0 = a_1

Similarly, for i = 2, ..., n,
        φ_i(u) = a_1φ_i(v_1) + ... + a_iφ_i(v_i) + ... + a_nφ_i(v_n) = a_i
That is, φ_1(u) = a_1, φ_2(u) = a_2, ..., φ_n(u) = a_n. Substituting these results into (3), we obtain (1).


Next we prove (2). Applying the linear functional σ to both sides of (1),
        σ(u) = φ_1(u)σ(v_1) + φ_2(u)σ(v_2) + ... + φ_n(u)σ(v_n)
             = σ(v_1)φ_1(u) + σ(v_2)φ_2(u) + ... + σ(v_n)φ_n(u)
             = (σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n)(u)
Since the above holds for every u ∈ V, σ = σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n as claimed.


11.6. Prove Theorem 11.3: Let {v_1, ..., v_n} and {w_1, ..., w_n} be bases of V and let
{φ_1, ..., φ_n} and {σ_1, ..., σ_n} be the bases of V* dual to {v_i} and {w_i} respectively.
Suppose P is the transition matrix from {v_i} to {w_i}. Then (P^-1)^t is the transition
matrix from {φ_i} to {σ_i}.


Suppose
        w_1 = a_11v_1 + a_12v_2 + ... + a_1nv_n        σ_1 = b_11φ_1 + b_12φ_2 + ... + b_1nφ_n
        w_2 = a_21v_1 + a_22v_2 + ... + a_2nv_n        σ_2 = b_21φ_1 + b_22φ_2 + ... + b_2nφ_n
        ..................................        ..................................
        w_n = a_n1v_1 + a_n2v_2 + ... + a_nnv_n        σ_n = b_n1φ_1 + b_n2φ_2 + ... + b_nnφ_n


where P = (a_ij) and Q = (b_ij). We seek to prove that Q = (P^-1)^t.
Let R_i denote the ith row of Q and let C_j denote the jth column of P^t. Then
        R_i = (b_i1, b_i2, ..., b_in)    and    C_j = (a_j1, a_j2, ..., a_jn)^t
By definition of the dual basis,
        σ_i(w_j) = (b_i1φ_1 + b_i2φ_2 + ... + b_inφ_n)(a_j1v_1 + a_j2v_2 + ... + a_jnv_n)
                 = b_i1a_j1 + b_i2a_j2 + ... + b_ina_jn = R_iC_j = δ_ij


where δ_ij is the Kronecker delta. Thus

               ( R_1C_1  R_1C_2  ...  R_1C_n )     ( 1 0 ... 0 )
        QP^t = ( R_2C_1  R_2C_2  ...  R_2C_n )  =  ( 0 1 ... 0 )  =  I
               ( ........................... )     ( ......... )
               ( R_nC_1  R_nC_2  ...  R_nC_n )     ( 0 0 ... 1 )


and hence Q = (P^t)^-1 = (P^-1)^t as claimed.


11.7. Suppose V has finite dimension. Show that if v ∈ V, v ≠ 0, then there exists
φ ∈ V* such that φ(v) ≠ 0.


We extend {v} to a basis {v, v_2, ..., v_n} of V. By Theorem 6.1, there exists a unique linear
mapping φ: V → K such that φ(v) = 1 and φ(v_i) = 0, i = 2, ..., n. Hence φ has the desired
property.


11.8. Prove Theorem 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an
isomorphism of V onto V**. (Here v̂: V* → K is defined by v̂(φ) = φ(v).)

We first prove that the map v ↦ v̂ is linear, i.e. for any vectors v, w ∈ V and any scalars
a, b ∈ K, (av + bw)^ = a v̂ + b ŵ. For any linear functional φ ∈ V*,
        (av + bw)^(φ) = φ(av + bw) = aφ(v) + bφ(w)
                      = a v̂(φ) + b ŵ(φ) = (a v̂ + b ŵ)(φ)

Since (av + bw)^(φ) = (a v̂ + b ŵ)(φ) for every φ ∈ V*, we have (av + bw)^ = a v̂ + b ŵ. Thus the
map v ↦ v̂ is linear.

Now suppose v ∈ V, v ≠ 0. Then, by the preceding problem, there exists φ ∈ V* for which
φ(v) ≠ 0. Hence v̂(φ) = φ(v) ≠ 0 and thus v̂ ≠ 0. Since v ≠ 0 implies v̂ ≠ 0, the map v ↦ v̂
is nonsingular and hence an isomorphism (Theorem 6.5).

Now dim V = dim V* = dim V** because V has finite dimension. Accordingly, the mapping v ↦ v̂
is an isomorphism of V onto V**.

ANNIHILATORS 
11.9. Show that if φ ∈ V* annihilates a subset S of V, then φ annihilates the linear span
L(S) of S. Hence S^0 = (L(S))^0.

Suppose v ∈ L(S). Then there exist w_1, ..., w_r ∈ S for which v = a_1w_1 + a_2w_2 + ... + a_rw_r. Thus
        φ(v) = a_1φ(w_1) + a_2φ(w_2) + ... + a_rφ(w_r) = a_1·0 + a_2·0 + ... + a_r·0 = 0
Since v was an arbitrary element of L(S), φ annihilates L(S) as claimed.

11.10. Let W be the subspace of R^4 spanned by v_1 = (1, 2, -3, 4) and v_2 = (0, 1, 4, -1). Find
a basis of the annihilator of W.

By the preceding problem, it suffices to find a basis of the set of linear functionals φ(x, y, z, w) =
ax + by + cz + dw for which φ(v_1) = 0 and φ(v_2) = 0:
        φ(1, 2, -3, 4) = a + 2b - 3c + 4d = 0
        φ(0, 1, 4, -1) = b + 4c - d = 0
The system of equations in unknowns a, b, c, d is in echelon form with free variables c and d.
Set c = 1, d = 0 to obtain the solution a = 11, b = -4, c = 1, d = 0 and hence the linear
functional φ_1(x, y, z, w) = 11x - 4y + z.
Set c = 0, d = -1 to obtain the solution a = 6, b = -1, c = 0, d = -1 and hence the linear
functional φ_2(x, y, z, w) = 6x - y - w.
The set of linear functionals {φ_1, φ_2} is a basis of W^0, the annihilator of W.
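
As a quick check of Problem 11.10, the two functionals found above should vanish on both spanning vectors of W. The Python sketch below performs only this check; the helper name apply is a hypothetical choice.

```python
spanning = [(1, 2, -3, 4), (0, 1, 4, -1)]      # v_1, v_2
functionals = [(11, -4, 1, 0), (6, -1, 0, -1)]  # coefficients of phi_1, phi_2


def apply(phi, v):
    """Value of the functional with coefficient tuple phi at the vector v."""
    return sum(c * x for c, x in zip(phi, v))


# Every entry should be 0, since each phi_i annihilates each v_j.
checks = [apply(phi, v) for phi in functionals for v in spanning]
print(checks)
```

Since the two coefficient tuples are not multiples of each other and dim W^0 = 4 - 2 = 2 by Theorem 11.5, they form a basis of W^0.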

11.11. Show that: (i) for any subset S of V, S ⊂ S^00; (ii) if S_1 ⊂ S_2, then S_2^0 ⊂ S_1^0.

(i) Let v ∈ S. Then for every linear functional φ ∈ S^0, v̂(φ) = φ(v) = 0. Hence v̂ ∈ (S^0)^0.
Therefore, under the identification of V and V**, v ∈ S^00. Accordingly, S ⊂ S^00.
(ii) Let φ ∈ S_2^0. Then φ(v) = 0 for every v ∈ S_2. But S_1 ⊂ S_2; hence φ annihilates every
element of S_1, i.e. φ ∈ S_1^0. Therefore S_2^0 ⊂ S_1^0.

11.12. Prove Theorem 11.5: Suppose V has finite dimension and W is a subspace of V.
Then (i) dim W + dim W^0 = dim V and (ii) W^00 = W.


(i) Suppose dim V = n and dim W = r ≤ n. We want to show that dim W^0 = n - r. We choose
a basis {w_1, ..., w_r} of W and extend it to the following basis of V: {w_1, ..., w_r, v_1, ..., v_{n-r}}.
Consider the dual basis
        {φ_1, ..., φ_r, σ_1, ..., σ_{n-r}}
By definition of the dual basis, each of the above σ's annihilates each w_i; hence
σ_1, ..., σ_{n-r} ∈ W^0. We claim that {σ_j} is a basis of W^0. Now {σ_j} is part of a basis of V* and
so it is linearly independent.
We next show that {σ_j} spans W^0. Let σ ∈ W^0. By Theorem 11.2,

        σ = σ(w_1)φ_1 + ... + σ(w_r)φ_r + σ(v_1)σ_1 + ... + σ(v_{n-r})σ_{n-r}
          = 0φ_1 + ... + 0φ_r + σ(v_1)σ_1 + ... + σ(v_{n-r})σ_{n-r}
          = σ(v_1)σ_1 + ... + σ(v_{n-r})σ_{n-r}

Thus {σ_1, ..., σ_{n-r}} spans W^0 and so it is a basis of W^0. Accordingly, dim W^0 = n - r =
dim V - dim W as required.


(ii) Suppose dim V = n and dim W = r. Then dim V* = n and, by (i), dim W^0 = n - r. Thus
by (i), dim W^00 = n - (n - r) = r; therefore dim W = dim W^00. By the preceding problem,
W ⊂ W^00. Accordingly, W = W^00.




11.13. Let U and W be subspaces of V. Prove: (U + W)^0 = U^0 ∩ W^0.

Let φ ∈ (U + W)^0. Then φ annihilates U + W and so, in particular, φ annihilates U and W.
That is, φ ∈ U^0 and φ ∈ W^0; hence φ ∈ U^0 ∩ W^0. Thus (U + W)^0 ⊂ U^0 ∩ W^0.


On the other hand, suppose σ ∈ U^0 ∩ W^0. Then σ annihilates U and also W. If v ∈ U + W,
then v = u + w where u ∈ U and w ∈ W. Hence σ(v) = σ(u) + σ(w) = 0 + 0 = 0. Thus σ
annihilates U + W, i.e. σ ∈ (U + W)^0. Accordingly, U^0 ∩ W^0 ⊂ (U + W)^0.


Both inclusion relations give us the desired equality. 


Remark: Observe that no dimension argument is employed in the proof; hence the result holds 
for spaces of finite or infinite dimension. 


TRANSPOSE OF A LINEAR MAPPING 


11.14. Let φ be the linear functional on R^2 defined by φ(x, y) = x - 2y. For each of the
following linear operators T on R^2, find (T^t(φ))(x, y): (i) T(x, y) = (x, 0); (ii) T(x, y) =
(y, x + y); (iii) T(x, y) = (2x - 3y, 5x + 2y).

By definition of the transpose mapping, T^t(φ) = φ∘T, i.e. (T^t(φ))(v) = φ(T(v)) for every
vector v. Hence
(i) (T^t(φ))(x, y) = φ(T(x, y)) = φ(x, 0) = x
(ii) (T^t(φ))(x, y) = φ(T(x, y)) = φ(y, x + y) = y - 2(x + y) = -2x - y
(iii) (T^t(φ))(x, y) = φ(T(x, y)) = φ(2x - 3y, 5x + 2y) = (2x - 3y) - 2(5x + 2y) = -8x - 7y.
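
The three answers can be reproduced in coordinates. The sketch below assumes each operator is written as a matrix A acting on column vectors, T(v) = Av, and φ as the row of coefficients (1, -2); the coefficients of φ∘T are then the entries of the row vector φA. The function name pull_back and the dictionary labels are illustrative choices.

```python
def pull_back(phi, A):
    """Coefficients of the composition phi∘T, where T(v) = A v and phi is a row vector."""
    ncols = len(A[0])
    return [sum(phi[i] * A[i][j] for i in range(len(A))) for j in range(ncols)]


phi = [1, -2]   # phi(x, y) = x - 2y

operators = {
    "(i)": [[1, 0], [0, 0]],      # T(x, y) = (x, 0)
    "(ii)": [[0, 1], [1, 1]],     # T(x, y) = (y, x + y)
    "(iii)": [[2, -3], [5, 2]],   # T(x, y) = (2x - 3y, 5x + 2y)
}

results = {label: pull_back(phi, A) for label, A in operators.items()}
for label, (a, b) in results.items():
    print(f"{label}: (T^t(phi))(x, y) = ({a})x + ({b})y")
```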


11.15. Let T: V → U be linear and let T^t: U* → V* be its transpose. Show that the kernel
of T^t is the annihilator of the image of T, i.e. Ker T^t = (Im T)^0.


Suppose φ ∈ Ker T^t; that is, T^t(φ) = φ∘T = 0. If u ∈ Im T, then u = T(v) for some
v ∈ V; hence
        φ(u) = φ(T(v)) = (φ∘T)(v) = 0(v) = 0

We have that φ(u) = 0 for every u ∈ Im T; hence φ ∈ (Im T)^0. Thus Ker T^t ⊂ (Im T)^0.
On the other hand, suppose σ ∈ (Im T)^0; that is, σ(Im T) = {0}. Then, for every v ∈ V,
        (T^t(σ))(v) = (σ∘T)(v) = σ(T(v)) = 0 = 0(v)
We have that (T^t(σ))(v) = 0(v) for every v ∈ V; hence T^t(σ) = 0. Therefore σ ∈ Ker T^t and so
(Im T)^0 ⊂ Ker T^t.
Both inclusion relations give us the required equality. 


11.16. Suppose V and U have finite dimension and suppose T: V → U is linear. Prove:
rank (T) = rank (T^t).
Suppose dim V = n and dim U = m. Also suppose rank (T) = r. Then, by Theorem 11.5,
        dim ((Im T)^0) = dim U - dim (Im T) = m - rank (T) = m - r
By the preceding problem, Ker T^t = (Im T)^0. Hence nullity (T^t) = m - r. It then follows that, as
claimed,
        rank (T^t) = dim U* - nullity (T^t) = m - (m - r) = r = rank (T)


11.17. Prove Theorem 11.7: Let T: V → U be linear and let A be the matrix representation
of T relative to bases {v_1, ..., v_m} of V and {u_1, ..., u_n} of U. Then the transpose
matrix A^t is the matrix representation of T^t: U* → V* relative to the bases dual to
{u_i} and {v_j}.

Suppose
        T(v_1) = a_11u_1 + a_12u_2 + ... + a_1nu_n
        T(v_2) = a_21u_1 + a_22u_2 + ... + a_2nu_n        (1)
        ......................................
        T(v_m) = a_m1u_1 + a_m2u_2 + ... + a_mnu_n

We want to prove that
        T^t(σ_1) = a_11φ_1 + a_21φ_2 + ... + a_m1φ_m
        T^t(σ_2) = a_12φ_1 + a_22φ_2 + ... + a_m2φ_m        (2)
        ........................................
        T^t(σ_n) = a_1nφ_1 + a_2nφ_2 + ... + a_mnφ_m

where {σ_i} and {φ_j} are the bases dual to {u_i} and {v_j} respectively.


Let v ∈ V and suppose v = k_1v_1 + k_2v_2 + ... + k_mv_m. Then, by (1),

        T(v) = k_1T(v_1) + k_2T(v_2) + ... + k_mT(v_m)
             = k_1(a_11u_1 + ... + a_1nu_n) + k_2(a_21u_1 + ... + a_2nu_n) + ... + k_m(a_m1u_1 + ... + a_mnu_n)
             = (k_1a_11 + k_2a_21 + ... + k_ma_m1)u_1 + ... + (k_1a_1n + k_2a_2n + ... + k_ma_mn)u_n
             = Σ_{i=1}^n (k_1a_1i + k_2a_2i + ... + k_ma_mi)u_i

Hence for j = 1, ..., n,
        (T^t(σ_j))(v) = σ_j(T(v)) = σ_j(Σ_{i=1}^n (k_1a_1i + k_2a_2i + ... + k_ma_mi)u_i)
                      = k_1a_1j + k_2a_2j + ... + k_ma_mj        (3)

On the other hand, for j = 1, ..., n,
        (a_1jφ_1 + a_2jφ_2 + ... + a_mjφ_m)(v) = (a_1jφ_1 + a_2jφ_2 + ... + a_mjφ_m)(k_1v_1 + k_2v_2 + ... + k_mv_m)
                      = k_1a_1j + k_2a_2j + ... + k_ma_mj        (4)
Since v ∈ V was arbitrary, (3) and (4) imply that
        T^t(σ_j) = a_1jφ_1 + a_2jφ_2 + ... + a_mjφ_m,    j = 1, ..., n
which is (2). Thus the theorem is proved.


11.18. Let A be an arbitrary m × n matrix over a field K. Prove that the row rank and the
column rank of A are equal.


Let T: K^n → K^m be the linear map defined by T(v) = Av, where the elements of K^n and K^m
are written as column vectors. Then A is the matrix representation of T relative to the usual bases
of K^n and K^m, and the image of T is the column space of A. Hence

        rank (T) = column rank of A

By Theorem 11.7, A^t is the matrix representation of T^t relative to the dual bases. Hence

        rank (T^t) = column rank of A^t = row rank of A

But by Problem 11.16, rank (T) = rank (T^t); hence the row rank and the column rank of A are
equal. (This result was stated earlier as Theorem 5.9, page 90, and was proved in a direct way
in Problem 5.21.)


Supplementary Problems 


DUAL SPACES AND DUAL BASES 


11.19. Let φ: R^3 → R and σ: R^3 → R be the linear functionals defined by φ(x, y, z) = 2x - 3y + z and
σ(x, y, z) = 4x - 2y + 3z. Find (i) φ + σ, (ii) 3φ, (iii) 2φ - 5σ.


11.20. Let φ be the linear functional on R^2 defined by φ(2, 1) = 15 and φ(1, -2) = -10. Find φ(x, y) and,
in particular, find φ(-2, 7).


11.21. Find the dual basis of each of the following bases of R^3:
(i) {(1, 0, 0), (0, 1, 0), (0, 0, 1)}; (ii) {(1, -2, 3), (1, -1, 1), (2, -4, 7)}.


11.22. Let V be the vector space of polynomials over R of degree ≤ 2. Let φ_1, φ_2 and φ_3 be the linear
functionals on V defined by

        φ_1(f(t)) = ∫_0^1 f(t) dt,    φ_2(f(t)) = f'(1),    φ_3(f(t)) = f(0)

Find the basis {f_1(t), f_2(t), f_3(t)} of V which is dual to {φ_1, φ_2, φ_3}.


11.23. Suppose u, v ∈ V and that φ(u) = 0 implies φ(v) = 0 for all φ ∈ V*. Show that v = ku for
some scalar k.


11.24. Suppose φ, σ ∈ V* and that φ(v) = 0 implies σ(v) = 0 for all v ∈ V. Show that σ = kφ for
some scalar k.


11.25. Let V be the vector space of polynomials over K. For a ∈ K, define φ_a: V → K by φ_a(f(t)) = f(a).
Show that: (i) φ_a is linear; (ii) if a ≠ b, then φ_a ≠ φ_b.


11.26. Let V be the vector space of polynomials of degree ≤ 2. Let a, b, c ∈ K be distinct scalars. Let
φ_a, φ_b and φ_c be the linear functionals defined by φ_a(f(t)) = f(a), φ_b(f(t)) = f(b), φ_c(f(t)) = f(c). Show
that {φ_a, φ_b, φ_c} is linearly independent, and find the basis {f_1(t), f_2(t), f_3(t)} of V which is its dual.


11.27. Let V be the vector space of square matrices of order n. Let T: V → K be the trace mapping:
T(A) = a_11 + a_22 + ... + a_nn, where A = (a_ij). Show that T is linear.


11.28. Let W be a subspace of V. For any linear functional φ on W, show that there is a linear functional
σ on V such that σ(w) = φ(w) for any w ∈ W, i.e. φ is the restriction of σ to W.


11.29. Let {e_1, ..., e_n} be the usual basis of K^n. Show that the dual basis is {π_1, ..., π_n} where π_i is the
ith projection mapping: π_i(a_1, ..., a_n) = a_i.


11.30. Let V be a vector space over R. Let φ_1, φ_2 ∈ V* and suppose σ: V → R defined by σ(v) = φ_1(v) φ_2(v)
also belongs to V*. Show that either φ_1 = 0 or φ_2 = 0.


ANNIHILATORS 

11.31. Let W be the subspace of R^4 spanned by (1, 2, -3, 4), (1, 3, -2, 6) and (1, 4, -1, 8). Find a basis of
the annihilator of W.

11.32. Let W be the subspace of R^3 spanned by (1, 1, 0) and (0, 1, 1). Find a basis of the annihilator of W.

11.33. Show that, for any subset S of V, L(S) = S^00 where L(S) is the linear span of S.


11.34. Let U and W be subspaces of a vector space V of finite dimension. Prove: (U ∩ W)^0 = U^0 + W^0.


11.35. Suppose V = U ⊕ W. Prove that V* = U^0 ⊕ W^0.


TRANSPOSE OF A LINEAR MAPPING 

11.36. Let φ be the linear functional on R^2 defined by φ(x, y) = 3x - 2y. For each linear mapping
T: R^3 → R^2, find (T^t(φ))(x, y, z):
(i) T(x, y, z) = (x + y, y + z); (ii) T(x, y, z) = (x + y + z, 2x - y)


11.37. Suppose S: U → V and T: V → W are linear. Prove that (T∘S)^t = S^t∘T^t.

11.38. Suppose T: V → U is linear and V has finite dimension. Prove that Im T^t = (Ker T)^0.


11.39. Suppose T: V → U is linear and u ∈ U. Prove that u ∈ Im T or there exists φ ∈ U* such that
T^t(φ) = 0 and φ(u) = 1.


11.40. Let V be of finite dimension. Show that the mapping T ↦ T^t is an isomorphism from Hom (V, V)
onto Hom (V*, V*). (Here T is any linear operator on V.)


MISCELLANEOUS PROBLEMS


11.41. Let V be a vector space over R. The line segment uv joining points u, v ∈ V is defined by
uv = {tu + (1 - t)v: 0 ≤ t ≤ 1}. A subset S of V is termed convex if u, v ∈ S implies uv ⊂ S.
Let φ ∈ V* and let

        W+ = {v ∈ V: φ(v) > 0},    W = {v ∈ V: φ(v) = 0},    W- = {v ∈ V: φ(v) < 0}

Prove that W+, W and W- are convex.


11.42. Let V be a vector space of finite dimension. A hyperplane H of V is defined to be the kernel of a
nonzero linear functional φ on V. Show that every subspace of V is the intersection of a finite
number of hyperplanes.


Answers to Supplementary Problems


11.19. (i) 6x - 5y + 4z, (ii) 6x - 9y + 3z, (iii) -16x + 4y - 13z


11.20. φ(x, y) = 4x + 7y, φ(-2, 7) = 41


11.21. (i) {φ_1(x, y, z) = x, φ_2(x, y, z) = y, φ_3(x, y, z) = z}
       (ii) {φ_1(x, y, z) = -3x - 5y - 2z, φ_2(x, y, z) = 2x + y, φ_3(x, y, z) = x + 2y + z}


11.25. (ii) Let f(t) = t. Then φ_a(f(t)) = a ≠ b = φ_b(f(t)), and therefore φ_a ≠ φ_b.


11.26. f_1(t) = (t^2 - (b + c)t + bc) / ((a - b)(a - c)),
       f_2(t) = (t^2 - (a + c)t + ac) / ((b - a)(b - c)),
       f_3(t) = (t^2 - (a + b)t + ab) / ((c - a)(c - b))


11.31. {φ_1(x, y, z, t) = 5x - y + z, φ_2(x, y, z, t) = 2y - t}


11.32. {φ(x, y, z) = x - y + z}


Chapter 12 


Bilinear, Quadratic and Hermitian Forms 


BILINEAR FORMS 

Let V be a vector space of finite dimension over a field K. A bilinear form on V is a
mapping f: V × V → K which satisfies

(i) f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)

(ii) f(u, av_1 + bv_2) = af(u, v_1) + bf(u, v_2)
for all a, b ∈ K and all u_i, v_i ∈ V. We express condition (i) by saying f is linear in the
first variable, and condition (ii) by saying f is linear in the second variable.


Example 12.1: Let φ and σ be arbitrary linear functionals on V. Let f: V × V → K be defined by
f(u, v) = φ(u) σ(v). Then f is bilinear because φ and σ are each linear. (Such a
bilinear form f turns out to be the "tensor product" of φ and σ and so is sometimes
written f = φ ⊗ σ.)

Example 12.2: Let f be the dot product on R^n; that is,

        f(u, v) = u·v = a_1b_1 + a_2b_2 + ... + a_nb_n

where u = (a_i) and v = (b_i). Then f is a bilinear form on R^n.


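Example 12.3: Let $A = (a_{ij})$ be any $n \times n$ matrix over $K$. Then $A$ may be viewed as a bilinear
form $f$ on $K^n$ by defining
\[
f(X, Y) = X^tAY = (x_1, x_2, \ldots, x_n)
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
= \sum_{i,j=1}^{n} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + \cdots + a_{nn}x_ny_n
\]
The above formal expression in variables $x_i, y_i$ is termed the bilinear polynomial
corresponding to the matrix $A$. Formula (1) below shows that, in a certain sense,
every bilinear form is of this type.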
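
As a small numerical illustration of Example 12.3, the sketch below builds $f(X, Y) = X^tAY$ from a matrix and checks linearity in the first variable on one sample. The helper name `bilinear_form` and the sample matrix and vectors are assumptions chosen only for illustration.

```python
def bilinear_form(A):
    """Return f with f(X, Y) = X^t A Y, where A is a square matrix given as a list of rows."""
    n = len(A)

    def f(X, Y):
        # Sum of a_ij * x_i * y_j: the bilinear polynomial of A evaluated at X, Y.
        return sum(A[i][j] * X[i] * Y[j] for i in range(n) for j in range(n))

    return f


A = [[1, 2], [3, 4]]          # illustrative matrix, not taken from the text
f = bilinear_form(A)

X1, X2, Y = [1, 0], [0, 1], [5, 7]
a, b = 2, -3

# Linearity in the first variable: f(a X1 + b X2, Y) = a f(X1, Y) + b f(X2, Y)
lhs = f([a * X1[i] + b * X2[i] for i in range(2)], Y)
rhs = a * f(X1, Y) + b * f(X2, Y)
```

Evaluating $f$ on unit vectors recovers the entries of $A$; for instance `f([1, 0], [0, 1])` returns the entry $a_{12}$.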
We will let $B(V)$ denote the set of bilinear forms on $V$. A vector space structure is
placed on $B(V)$ by defining $f + g$ and $kf$ by:
\[ (f + g)(u, v) = f(u, v) + g(u, v) \]
\[ (kf)(u, v) = k f(u, v) \]
for any $f, g \in B(V)$ and any $k \in K$. In fact,

Theorem 12.1: Let $V$ be a vector space of dimension $n$ over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be a
basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$
where $f_{ij}$ is defined by $f_{ij}(u, v) = \phi_i(u)\,\phi_j(v)$. Thus, in particular,
$\dim B(V) = n^2$.




BILINEAR FORMS AND MATRICES 


Let $f$ be a bilinear form on $V$, and let $\{e_1, \ldots, e_n\}$ be a basis of $V$. Suppose $u, v \in V$
and suppose
\[ u = a_1e_1 + \cdots + a_ne_n, \qquad v = b_1e_1 + \cdots + b_ne_n \]
Then
\[
\begin{aligned}
f(u, v) &= f(a_1e_1 + \cdots + a_ne_n,\ b_1e_1 + \cdots + b_ne_n) \\
&= a_1b_1 f(e_1, e_1) + a_1b_2 f(e_1, e_2) + \cdots + a_nb_n f(e_n, e_n) = \sum_{i,j=1}^{n} a_ib_j f(e_i, e_j)
\end{aligned}
\]
Thus $f$ is completely determined by the $n^2$ values $f(e_i, e_j)$.


The matrix $A = (a_{ij})$ where $a_{ij} = f(e_i, e_j)$ is called the matrix representation of $f$ relative
to the basis $\{e_i\}$ or, simply, the matrix of $f$ in $\{e_i\}$. It "represents" $f$ in the sense that
\[
f(u, v) = \sum a_ib_j f(e_i, e_j) = (a_1, \ldots, a_n)\, A \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = [u]_e^t\, A\, [v]_e \qquad (1)
\]
for all $u, v \in V$. (As usual, $[u]_e$ denotes the coordinate (column) vector of $u \in V$ in the
basis $\{e_i\}$.)


We next ask, how does a matrix representing a bilinear form transform when a new
basis is selected? The answer is given in the following theorem. (Recall Theorem 7.4 that
the transition matrix $P$ from one basis $\{e_i\}$ to another $\{e_i'\}$ has the property that $[u]_e = P[u]_{e'}$
for every $u \in V$.)


Theorem 12.2: Let $P$ be the transition matrix from one basis to another. If $A$ is the
matrix of $f$ in the original basis, then
\[ B = P^tAP \]
is the matrix of $f$ in the new basis.


The above theorem motivates the following definition. 


Definition: A matrix $B$ is said to be congruent to a matrix $A$ if there exists an invertible
(or: nonsingular) matrix $P$ such that $B = P^tAP$.


Thus by the above theorem matrices representing the same bilinear form are congruent.
We remark that congruent matrices have the same rank because $P$ and $P^t$ are nonsingular;
hence the following definition is well defined.


Definition: The rank of a bilinear form $f$ on $V$, written $\operatorname{rank}(f)$, is defined to be the rank
of any matrix representation. We say that $f$ is degenerate or nondegenerate
according as to whether $\operatorname{rank}(f) < \dim V$ or $\operatorname{rank}(f) = \dim V$.


ALTERNATING BILINEAR FORMS 
A bilinear form $f$ on $V$ is said to be alternating if
(i) $f(v, v) = 0$
for every $v \in V$. If $f$ is alternating, then
\[ 0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) \]
and so (ii) $f(u, v) = -f(v, u)$
for every $u, v \in V$. A bilinear form which satisfies condition (ii) is said to be skew symmetric.
If $1 + 1 \ne 0$ in $K$, then condition (ii) implies $f(v, v) = -f(v, v)$,
which implies condition (i). In other words, alternating and skew symmetric are equivalent
when $1 + 1 \ne 0$.


The main structure theorem of alternating bilinear forms follows. 


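Theorem 12.3: Let $f$ be an alternating bilinear form on $V$. Then there exists a basis of
$V$ in which $f$ is represented by a matrix of the form
\[
\begin{pmatrix}
0 & 1 & & & & & & \\
-1 & 0 & & & & & & \\
 & & \ddots & & & & & \\
 & & & 0 & 1 & & & \\
 & & & -1 & 0 & & & \\
 & & & & & 0 & & \\
 & & & & & & \ddots & \\
 & & & & & & & 0
\end{pmatrix}
\]
Moreover, the number of blocks $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is uniquely determined by $f$ (because
it is equal to $\tfrac{1}{2}\operatorname{rank}(f)$).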


In particular, the above theorem shows that an alternating bilinear form must have 
even rank. 


SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS 
A bilinear form $f$ on $V$ is said to be symmetric if
\[ f(u, v) = f(v, u) \]
for every $u, v \in V$. If $A$ is a matrix representation of $f$, we can write
\[ f(X, Y) = X^tAY = (X^tAY)^t = Y^tA^tX \]
(We use the fact that $X^tAY$ is a scalar and therefore equals its transpose.) Thus if $f$ is
symmetric,
\[ Y^tA^tX = f(X, Y) = f(Y, X) = Y^tAX \]
and since this is true for all vectors $X, Y$ it follows that $A = A^t$, i.e. $A$ is symmetric.
Conversely, if $A$ is symmetric, then $f$ is symmetric.


The main result for symmetric bilinear forms is given in 


Theorem 12.4: Let $f$ be a symmetric bilinear form on $V$ over $K$ (in which $1 + 1 \ne 0$).
Then $V$ has a basis $\{v_1, \ldots, v_n\}$ in which $f$ is represented by a diagonal
matrix, i.e. $f(v_i, v_j) = 0$ for $i \ne j$.


Alternate Form of Theorem 12.4: Let $A$ be a symmetric matrix over $K$ (in which $1 + 1 \ne 0$).
Then there exists an invertible (or: nonsingular) matrix $P$ such that $P^tAP$
is diagonal. That is, $A$ is congruent to a diagonal matrix.




Since an invertible matrix $P$ is a product of elementary matrices (Problem 3.36), one way
of obtaining the diagonal form $P^tAP$ is by a sequence of elementary row operations and
the same sequence of elementary column operations. These same elementary row operations
on $I$ will yield $P^t$. This method is illustrated in the next example.


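Example 12.4: Let $A = \begin{pmatrix} 1 & 2 & -3 \\ 2 & 5 & -4 \\ -3 & -4 & 8 \end{pmatrix}$, a symmetric matrix. It is convenient to form the block
matrix $(A, I)$:
\[
(A, I) = \left(\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 5 & -4 & 0 & 1 & 0 \\ -3 & -4 & 8 & 0 & 0 & 1 \end{array}\right)
\]
We apply the operations $R_2 \to -2R_1 + R_2$ and $R_3 \to 3R_1 + R_3$ to $(A, I)$, and then
the corresponding operations $C_2 \to -2C_1 + C_2$ and $C_3 \to 3C_1 + C_3$ to $A$ to obtain
\[
\left(\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right)
\]
We next apply the operation $R_3 \to -2R_2 + R_3$ and then the corresponding operation
$C_3 \to -2C_2 + C_3$ to obtain
\[
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right)
\]
Now $A$ has been diagonalized. We set
\[
P = \begin{pmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}
\quad\text{and then}\quad
P^tAP = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -5 \end{pmatrix}
\]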
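
The row-and-column reduction of Example 12.4 can be sketched in code. The following is a minimal sketch, assuming rational entries (so that $1 + 1 \ne 0$) and following the three cases of Problem 12.9; the function name `congruence_diagonalize` and its helpers are hypothetical. Unlike Problem 12.9, it divides by the pivot instead of scaling rows, so in general its diagonal entries may differ from a hand computation by nonzero square factors.

```python
from fractions import Fraction


def congruence_diagonalize(A):
    """Return (P, D) with P^t A P = D diagonal, for a symmetric matrix A with rational entries."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]                      # working copy of A
    Q = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # ends up equal to P^t

    def add_multiple(i, j, c):
        # Row operation R_i -> R_i + c R_j on (M, Q), then column operation C_i -> C_i + c C_j on M.
        for k in range(n):
            M[i][k] += c * M[j][k]
            Q[i][k] += c * Q[j][k]
        for k in range(n):
            M[k][i] += c * M[k][j]

    def swap(i, j):
        # R_i <-> R_j on (M, Q), then C_i <-> C_j on M.
        M[i], M[j] = M[j], M[i]
        Q[i], Q[j] = Q[j], Q[i]
        for k in range(n):
            M[k][i], M[k][j] = M[k][j], M[k][i]

    for p in range(n):
        if M[p][p] == 0:
            # Case II: a later diagonal entry is nonzero; move it into position p.
            d = next((i for i in range(p + 1, n) if M[i][i] != 0), None)
            if d is not None:
                swap(p, d)
            else:
                # Case III: remaining diagonal is zero; use a nonzero entry in column p.
                r = next((i for i in range(p + 1, n) if M[i][p] != 0), None)
                if r is None:
                    continue  # row p and column p are already zero
                add_multiple(p, r, Fraction(1))  # puts 2 * a_rp (nonzero) on the diagonal
        # Case I: clear row p and column p using the nonzero pivot M[p][p].
        for i in range(p + 1, n):
            if M[i][p] != 0:
                add_multiple(i, p, -M[i][p] / M[p][p])

    P = [[Q[j][i] for j in range(n)] for i in range(n)]  # transpose of Q
    return P, M


# The matrix of Example 12.4
P, D = congruence_diagonalize([[1, 2, -3], [2, 5, -4], [-3, -4, 8]])
# P is [[1, -2, 7], [0, 1, -2], [0, 0, 1]] and D is diag(1, 1, -5)
```

On the matrix of Example 12.4 the pivots are all 1 at the moment they are used, so this sketch performs exactly the operations shown above and reproduces the same $P$.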


Definition: A mapping $q : V \to K$ is called a quadratic form if $q(v) = f(v, v)$ for some
symmetric bilinear form $f$ on $V$.


We call $q$ the quadratic form associated with the symmetric bilinear form $f$. If
$1 + 1 \ne 0$ in $K$, then $f$ is obtainable from $q$ according to the identity
\[ f(u, v) = \tfrac{1}{2}\bigl(q(u + v) - q(u) - q(v)\bigr) \]
The above formula is called the polar form of $f$.


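Now if $f$ is represented by a symmetric matrix $A = (a_{ij})$, then $q$ is represented in the
form
\[
\begin{aligned}
q(X) = f(X, X) = X^tAX &= (x_1, \ldots, x_n)
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \\
&= \sum_{i,j} a_{ij}x_ix_j = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + 2\sum_{i<j} a_{ij}x_ix_j
\end{aligned}
\]
The above formal expression in variables $x_i$ is termed the quadratic polynomial corresponding
to the symmetric matrix $A$. Observe that if the matrix $A$ is diagonal, then $q$ has the
diagonal representation
\[ q(X) = X^tAX = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 \]
that is, the quadratic polynomial representing $q$ will contain no "cross product" terms. By
Theorem 12.4, every quadratic form has such a representation (when $1 + 1 \ne 0$).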




Example 12.5: Consider the following quadratic form on $\mathbf{R}^2$:
\[ q(x, y) = 2x^2 - 12xy + 5y^2 \]
One way of diagonalizing $q$ is by the method known as "completing the square"
which is fully described in Problem 12.35. In this case, we make the substitution
$x = s + 3t$, $y = t$ to obtain the diagonal form
\[ q(x, y) = 2(s + 3t)^2 - 12(s + 3t)t + 5t^2 = 2s^2 - 13t^2 \]
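
For readers who want the intermediate step of Example 12.5, the expansion can be written out term by term. The grouping below is only one way to organize the arithmetic; the substitution itself comes from completing the square, $2x^2 - 12xy = 2(x - 3y)^2 - 18y^2$, which suggests $s = x - 3y$, $t = y$.

```latex
\begin{aligned}
2(s+3t)^2 - 12(s+3t)t + 5t^2
  &= (2s^2 + 12st + 18t^2) - (12st + 36t^2) + 5t^2 \\
  &= 2s^2 + (12 - 12)\,st + (18 - 36 + 5)\,t^2 \\
  &= 2s^2 - 13t^2
\end{aligned}
```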


REAL SYMMETRIC BILINEAR FORMS. LAW OF INERTIA 


In this section we treat symmetric bilinear forms and quadratic forms on vector 
spaces over the real field R. These forms appear in many branches of mathematics and 
physics. The special nature of R permits an independent theory. The main result follows. 


Theorem 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there is a basis of
$V$ in which $f$ is represented by a diagonal matrix; every other diagonal
representation has the same number $P$ of positive entries and the same
number $N$ of negative entries. The difference $S = P - N$ is called the
signature of $f$.


A real symmetric bilinear form $f$ is said to be nonnegative semidefinite if
\[ q(v) = f(v, v) \ge 0 \]
for every vector $v$; and is said to be positive definite if
\[ q(v) = f(v, v) > 0 \]
for every vector $v \ne 0$. By the above theorem,
(i) $f$ is nonnegative semidefinite if and only if $S = \operatorname{rank}(f)$
(ii) $f$ is positive definite if and only if $S = \dim V$
where $S$ is the signature of $f$.


Example 12.6: Let $f$ be the dot product on $\mathbf{R}^n$; that is,
\[ f(u, v) = u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n \]
where $u = (a_i)$ and $v = (b_i)$. Note that $f$ is symmetric since
\[ f(u, v) = u \cdot v = v \cdot u = f(v, u) \]
Furthermore, $f$ is positive definite because
\[ f(u, u) = a_1^2 + a_2^2 + \cdots + a_n^2 > 0 \]
when $u \ne 0$.


In the next chapter we will see how a real quadratic form q transforms when the transi- 
tion matrix P is “orthogonal”. If no condition is placed on P, then q can be represented 
in diagonal form with only 1’s and —1’s as nonzero coefficients. Specifically, 

Corollary 12.6: Any real quadratic form $q$ has a unique representation in the form
\[ q(x_1, \ldots, x_n) = x_1^2 + \cdots + x_s^2 - x_{s+1}^2 - \cdots - x_r^2 \]

The above result for real quadratic forms is sometimes referred to as the Law of Inertia
or Sylvester's Theorem.




HERMITIAN FORMS 
Let $V$ be a vector space of finite dimension over the complex field $\mathbf{C}$. Let $f : V \times V \to \mathbf{C}$
be such that

(i) $f(au_1 + bu_2, v) = a f(u_1, v) + b f(u_2, v)$

(ii) $f(u, v) = \overline{f(v, u)}$

where $a, b \in \mathbf{C}$ and $u_i, v \in V$. Then $f$ is called a Hermitian form on $V$. (As usual, $\bar{k}$
denotes the complex conjugate of $k \in \mathbf{C}$.) By (i) and (ii),
\[
\begin{aligned}
f(u, av_1 + bv_2) &= \overline{f(av_1 + bv_2, u)} = \overline{a f(v_1, u) + b f(v_2, u)} \\
&= \bar{a}\,\overline{f(v_1, u)} + \bar{b}\,\overline{f(v_2, u)} = \bar{a} f(u, v_1) + \bar{b} f(u, v_2)
\end{aligned}
\]
That is, (iii) $f(u, av_1 + bv_2) = \bar{a} f(u, v_1) + \bar{b} f(u, v_2)$


As before, we express condition (i) by saying $f$ is linear in the first variable. On the other
hand, we express condition (iii) by saying $f$ is conjugate linear in the second variable. Note
that, by (ii), $f(v, v) = \overline{f(v, v)}$ and so $f(v, v)$ is real for every $v \in V$.


Example 12.7: Let $A = (a_{ij})$ be an $n \times n$ matrix over $\mathbf{C}$. We write $\bar{A}$ for the matrix obtained by
taking the complex conjugate of every entry of $A$, that is, $\bar{A} = (\bar{a}_{ij})$. We also write
$A^*$ for $\bar{A}^t = \overline{A^t}$. The matrix $A$ is said to be Hermitian if $A^* = A$, i.e. if $a_{ij} = \bar{a}_{ji}$.
If $A$ is Hermitian, then $f(X, Y) = X^tA\bar{Y}$ defines a Hermitian form on $\mathbf{C}^n$
(Problem 12.16).


The mapping $q : V \to \mathbf{R}$ defined by $q(v) = f(v, v)$ is called the Hermitian quadratic form
or complex quadratic form associated with the Hermitian form $f$. We can obtain $f$ from
$q$ according to the following identity called the polar form of $f$:
\[ f(u, v) = \tfrac{1}{4}\bigl(q(u + v) - q(u - v)\bigr) + \tfrac{i}{4}\bigl(q(u + iv) - q(u - iv)\bigr) \]
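
The polar form of a Hermitian form can be checked numerically on a small case. The sketch below assumes the form $f(X, Y) = X^tH\bar{Y}$ of Example 12.7 with an illustrative $2 \times 2$ Hermitian matrix; the matrix, the vectors and the helper names `hermitian_form` and `add` are assumptions made for this illustration only.

```python
def hermitian_form(H):
    """Return f with f(X, Y) = X^t H conj(Y): linear in X, conjugate linear in Y."""
    n = len(H)

    def f(X, Y):
        return sum(X[i] * H[i][j] * complex(Y[j]).conjugate()
                   for i in range(n) for j in range(n))

    return f


def add(u, v, c=1):
    """Return the vector u + c*v."""
    return [a + c * b for a, b in zip(u, v)]


H = [[2, 1 + 1j], [1 - 1j, 3]]   # illustrative Hermitian matrix: h_ij = conj(h_ji)
f = hermitian_form(H)


def q(v):
    """Hermitian quadratic form associated with f."""
    return f(v, v)


u = [1 + 2j, -1j]
v = [3, 2 - 1j]

# Right-hand side of the polar form, built only from values of q
polar = (q(add(u, v)) - q(add(u, v, -1))) / 4 \
        + 1j * (q(add(u, v, 1j)) - q(add(u, v, -1j))) / 4
```

Here `polar` agrees with `f(u, v)` up to rounding, and `q(u)` has zero imaginary part, as the remark after condition (iii) predicts.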


Now suppose $\{e_1, \ldots, e_n\}$ is a basis of $V$. The matrix $H = (h_{ij})$ where $h_{ij} = f(e_i, e_j)$ is
called the matrix representation of $f$ in the basis $\{e_i\}$. By (ii), $f(e_i, e_j) = \overline{f(e_j, e_i)}$; hence $H$
is Hermitian and, in particular, the diagonal entries of $H$ are real. Thus any diagonal
representation of $f$ contains only real entries. The next theorem is the complex analog of
Theorem 12.5 on real symmetric bilinear forms.


Theorem 12.7: Let $f$ be a Hermitian form on $V$. Then there exists a basis $\{e_1, \ldots, e_n\}$ of
$V$ in which $f$ is represented by a diagonal matrix, i.e. $f(e_i, e_j) = 0$ for
$i \ne j$. Moreover, every diagonal representation of $f$ has the same number
$P$ of positive entries, and the same number $N$ of negative entries. The
difference $S = P - N$ is called the signature of $f$.


Analogously, a Hermitian form $f$ is said to be nonnegative semidefinite if
\[ q(v) = f(v, v) \ge 0 \]
for every $v \in V$, and is said to be positive definite if
\[ q(v) = f(v, v) > 0 \]
for every $v \ne 0$.
Example 12.8: Let $f$ be the dot product on $\mathbf{C}^n$; that is,
\[ f(u, v) = u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n \]
where $u = (z_i)$ and $v = (w_i)$. Then $f$ is a Hermitian form on $\mathbf{C}^n$. Moreover, $f$ is
positive definite since, for any $u \ne 0$,
\[ f(u, u) = z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n = |z_1|^2 + |z_2|^2 + \cdots + |z_n|^2 > 0 \]




Solved Problems 
BILINEAR FORMS 
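12.1. Let $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$, and let
\[ f(u, v) = 3x_1y_1 - 2x_1y_2 + 5x_2y_1 + 7x_2y_2 - 8x_2y_3 + 4x_3y_2 - x_3y_3 \]
Express $f$ in matrix notation.


Let $A$ be the $3 \times 3$ matrix whose $ij$-entry is the coefficient of $x_iy_j$. Then
\[
f(u, v) = X^tAY = (x_1, x_2, x_3) \begin{pmatrix} 3 & -2 & 0 \\ 5 & 7 & -8 \\ 0 & 4 & -1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
\]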


12.2. Let $A$ be an $n \times n$ matrix over $K$. Show that the following mapping $f$ is a bilinear
form on $K^n$: $f(X, Y) = X^tAY$.

For any $a, b \in K$ and any $X_i, Y_i \in K^n$,
\[
\begin{aligned}
f(aX_1 + bX_2, Y) &= (aX_1 + bX_2)^tAY = (aX_1^t + bX_2^t)AY \\
&= aX_1^tAY + bX_2^tAY = a f(X_1, Y) + b f(X_2, Y)
\end{aligned}
\]
Hence $f$ is linear in the first variable. Also,
\[ f(X, aY_1 + bY_2) = X^tA(aY_1 + bY_2) = aX^tAY_1 + bX^tAY_2 = a f(X, Y_1) + b f(X, Y_2) \]
Hence $f$ is linear in the second variable, and so $f$ is a bilinear form on $K^n$.


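12.3. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
\[ f((x_1, x_2), (y_1, y_2)) = 2x_1y_1 - 3x_1y_2 + x_2y_2 \]
(i) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 0),\ u_2 = (1, 1)\}$.

(ii) Find the matrix $B$ of $f$ in the basis $\{v_1 = (2, 1),\ v_2 = (1, -1)\}$.

(iii) Find the transition matrix $P$ from the basis $\{u_i\}$ to the basis $\{v_i\}$, and verify
that $B = P^tAP$.


(i) Set $A = (a_{ij})$ where $a_{ij} = f(u_i, u_j)$:
\[
\begin{aligned}
a_{11} &= f(u_1, u_1) = f((1, 0), (1, 0)) = 2 - 0 + 0 = 2 \\
a_{12} &= f(u_1, u_2) = f((1, 0), (1, 1)) = 2 - 3 + 0 = -1 \\
a_{21} &= f(u_2, u_1) = f((1, 1), (1, 0)) = 2 - 0 + 0 = 2 \\
a_{22} &= f(u_2, u_2) = f((1, 1), (1, 1)) = 2 - 3 + 1 = 0
\end{aligned}
\]
Thus $A = \begin{pmatrix} 2 & -1 \\ 2 & 0 \end{pmatrix}$ is the matrix of $f$ in the basis $\{u_1, u_2\}$.


(ii) Set $B = (b_{ij})$ where $b_{ij} = f(v_i, v_j)$:
\[
\begin{aligned}
b_{11} &= f(v_1, v_1) = f((2, 1), (2, 1)) = 8 - 6 + 1 = 3 \\
b_{12} &= f(v_1, v_2) = f((2, 1), (1, -1)) = 4 + 6 - 1 = 9 \\
b_{21} &= f(v_2, v_1) = f((1, -1), (2, 1)) = 4 - 3 - 1 = 0 \\
b_{22} &= f(v_2, v_2) = f((1, -1), (1, -1)) = 2 + 3 + 1 = 6
\end{aligned}
\]
Thus $B = \begin{pmatrix} 3 & 9 \\ 0 & 6 \end{pmatrix}$ is the matrix of $f$ in the basis $\{v_1, v_2\}$.


(iii) We must write $v_1$ and $v_2$ in terms of the $u_i$:
\[
\begin{aligned}
v_1 &= (2, 1) = (1, 0) + (1, 1) = u_1 + u_2 \\
v_2 &= (1, -1) = 2(1, 0) - (1, 1) = 2u_1 - u_2
\end{aligned}
\]
Then $P = \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}$ and so $P^t = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}$. Thus
\[
P^tAP = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} 2 & -1 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 3 & 9 \\ 0 & 6 \end{pmatrix} = B
\]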
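
The computation in Problem 12.3 can be replayed mechanically. The sketch below takes $f((x_1, x_2), (y_1, y_2)) = 2x_1y_1 - 3x_1y_2 + x_2y_2$, the form consistent with the entries computed in the solution; the helper names `gram`, `matmul` and `transpose` are hypothetical.

```python
def f(u, v):
    """The bilinear form of Problem 12.3 on R^2."""
    (x1, x2), (y1, y2) = u, v
    return 2 * x1 * y1 - 3 * x1 * y2 + x2 * y2


def gram(basis):
    """Matrix of f in the given basis: entry (i, j) is f(basis[i], basis[j])."""
    return [[f(u, v) for v in basis] for u in basis]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


A = gram([(1, 0), (1, 1)])     # matrix of f in the basis {u1, u2}
B = gram([(2, 1), (1, -1)])    # matrix of f in the basis {v1, v2}
P = [[1, 2], [1, -1]]          # columns hold the coordinates of v1, v2 in {u1, u2}

congruent = matmul(matmul(transpose(P), A), P)   # P^t A P
```

With these inputs `congruent` equals `B`, which is the statement of Theorem 12.2 for this example.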


12.4. Prove Theorem 12.1: Let $V$ be a vector space of dimension $n$ over $K$. Let $\{\phi_1, \ldots, \phi_n\}$
be a basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$ where
$f_{ij}$ is defined by $f_{ij}(u, v) = \phi_i(u)\,\phi_j(v)$. Thus, in particular, $\dim B(V) = n^2$.

Let $\{e_1, \ldots, e_n\}$ be the basis of $V$ dual to $\{\phi_i\}$. We first show that $\{f_{ij}\}$ spans $B(V)$. Let $f \in B(V)$
and suppose $f(e_i, e_j) = a_{ij}$. We claim that $f = \sum a_{ij}f_{ij}$. It suffices to show that $f(e_s, e_t) =
(\sum a_{ij}f_{ij})(e_s, e_t)$ for $s, t = 1, \ldots, n$. We have
\[
\begin{aligned}
\Bigl(\sum a_{ij}f_{ij}\Bigr)(e_s, e_t) &= \sum a_{ij}f_{ij}(e_s, e_t) = \sum a_{ij}\,\phi_i(e_s)\,\phi_j(e_t) \\
&= \sum a_{ij}\,\delta_{is}\,\delta_{jt} = a_{st} = f(e_s, e_t)
\end{aligned}
\]
as required. Hence $\{f_{ij}\}$ spans $B(V)$.

It remains to show that $\{f_{ij}\}$ is linearly independent. Suppose $\sum a_{ij}f_{ij} = 0$. Then for
$s, t = 1, \ldots, n$,
\[ 0 = 0(e_s, e_t) = \Bigl(\sum a_{ij}f_{ij}\Bigr)(e_s, e_t) = a_{st} \]
The last step follows as above. Thus $\{f_{ij}\}$ is independent and hence is a basis of $B(V)$.


12.5. Let $[f]$ denote the matrix representation of a bilinear form $f$ on $V$ relative to a basis
$\{e_1, \ldots, e_n\}$ of $V$. Show that the mapping $f \mapsto [f]$ is an isomorphism of $B(V)$ onto
the vector space of $n$-square matrices.


Since $f$ is completely determined by the scalars $f(e_i, e_j)$, the mapping $f \mapsto [f]$ is one-to-one and
onto. It suffices to show that the mapping $f \mapsto [f]$ is a homomorphism; that is, that
\[ [af + bg] = a[f] + b[g] \qquad (*) \]
However, for $i, j = 1, \ldots, n$,
\[ (af + bg)(e_i, e_j) = a f(e_i, e_j) + b g(e_i, e_j) \]
which is a restatement of (*). Thus the result is proved.


12.6. Prove Theorem 12.2: Let $P$ be the transition matrix from one basis $\{e_i\}$ to another
basis $\{e_i'\}$. If $A$ is the matrix of $f$ in the original basis $\{e_i\}$, then $B = P^tAP$ is the
matrix of $f$ in the new basis $\{e_i'\}$.


Let $u, v \in V$. Since $P$ is the transition matrix from $\{e_i\}$ to $\{e_i'\}$, we have $P[u]_{e'} = [u]_e$ and
$P[v]_{e'} = [v]_e$; hence $[u]_e^t = [u]_{e'}^t P^t$. Thus
\[ f(u, v) = [u]_e^t\, A\, [v]_e = [u]_{e'}^t\, P^tAP\, [v]_{e'} \]
Since $u$ and $v$ are arbitrary elements of $V$, $P^tAP$ is the matrix of $f$ in the basis $\{e_i'\}$.


SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS 
12.7. Find the symmetric matrix which corresponds to each of the following quadratic
polynomials:
(i) $q(x, y) = 4x^2 - 6xy - 7y^2$
(ii) $q(x, y) = xy + y^2$
(iii) $q(x, y, z) = 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2$
(iv) $q(x, y, z) = x^2 - 2yz + xz$

The symmetric matrix $A = (a_{ij})$ representing $q(x_1, \ldots, x_n)$ has the diagonal entry $a_{ii}$ equal to
the coefficient of $x_i^2$ and has the entries $a_{ij}$ and $a_{ji}$ each equal to half the coefficient of $x_ix_j$. Thus
\[
\text{(i)}\ \begin{pmatrix} 4 & -3 \\ -3 & -7 \end{pmatrix} \qquad
\text{(ii)}\ \begin{pmatrix} 0 & \tfrac12 \\ \tfrac12 & 1 \end{pmatrix} \qquad
\text{(iii)}\ \begin{pmatrix} 3 & 2 & 4 \\ 2 & -1 & -3 \\ 4 & -3 & 1 \end{pmatrix} \qquad
\text{(iv)}\ \begin{pmatrix} 1 & 0 & \tfrac12 \\ 0 & 0 & -1 \\ \tfrac12 & -1 & 0 \end{pmatrix}
\]


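12.8. For each of the following real symmetric matrices $A$, find a nonsingular matrix $P$
such that $P^tAP$ is diagonal and also find its signature:
\[
\text{(i)}\ A = \begin{pmatrix} 1 & -3 & 2 \\ -3 & 7 & -5 \\ 2 & -5 & 8 \end{pmatrix} \qquad
\text{(ii)}\ A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & -2 & 2 \\ 1 & 2 & -1 \end{pmatrix}
\]

(i) First form the block matrix $(A, I)$:
\[
(A, I) = \left(\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ -3 & 7 & -5 & 0 & 1 & 0 \\ 2 & -5 & 8 & 0 & 0 & 1 \end{array}\right)
\]
Apply the row operations $R_2 \to 3R_1 + R_2$ and $R_3 \to -2R_1 + R_3$ to $(A, I)$ and then the
corresponding column operations $C_2 \to 3C_1 + C_2$ and $C_3 \to -2C_1 + C_3$ to $A$ to obtain
\[
\left(\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right)
\]
Next apply the row operation $R_3 \to R_2 + 2R_3$ and then the corresponding column operation
$C_3 \to C_2 + 2C_3$ to obtain
\[
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 0 & 9 & -1 & 1 & 2 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 0 & 3 & 1 & 0 \\ 0 & 0 & 18 & -1 & 1 & 2 \end{array}\right)
\]
Now $A$ has been diagonalized. Set
\[
P = \begin{pmatrix} 1 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}; \quad\text{then}\quad
P^tAP = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 18 \end{pmatrix}
\]
The signature $S$ of $A$ is $S = 2 - 1 = 1$.

(ii) First form the block matrix $(A, I)$:
\[
(A, I) = \left(\begin{array}{rrr|rrr} 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & -2 & 2 & 0 & 1 & 0 \\ 1 & 2 & -1 & 0 & 0 & 1 \end{array}\right)
\]
In order to bring the nonzero diagonal entry $-1$ into the first diagonal position, apply the row
operation $R_1 \leftrightarrow R_3$ and then the corresponding column operation $C_1 \leftrightarrow C_3$ to obtain
\[
\left(\begin{array}{rrr|rrr} 1 & 2 & -1 & 0 & 0 & 1 \\ 1 & -2 & 2 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} -1 & 2 & 1 & 0 & 0 & 1 \\ 2 & -2 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \end{array}\right)
\]
Apply the row operations $R_2 \to 2R_1 + R_2$ and $R_3 \to R_1 + R_3$ and then the corresponding
column operations $C_2 \to 2C_1 + C_2$ and $C_3 \to C_1 + C_3$ to obtain
\[
\left(\begin{array}{rrr|rrr} -1 & 2 & 1 & 0 & 0 & 1 \\ 0 & 2 & 3 & 0 & 1 & 2 \\ 0 & 3 & 1 & 1 & 0 & 1 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} -1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 2 & 3 & 0 & 1 & 2 \\ 0 & 3 & 1 & 1 & 0 & 1 \end{array}\right)
\]
Apply the row operation $R_3 \to -3R_2 + 2R_3$ and then the corresponding column operation
$C_3 \to -3C_2 + 2C_3$ to obtain
\[
\left(\begin{array}{rrr|rrr} -1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 2 & 3 & 0 & 1 & 2 \\ 0 & 0 & -7 & 2 & -3 & -4 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{rrr|rrr} -1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 2 & 0 & 0 & 1 & 2 \\ 0 & 0 & -14 & 2 & -3 & -4 \end{array}\right)
\]
Now $A$ has been diagonalized. Set
\[
P = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 1 & -3 \\ 1 & 2 & -4 \end{pmatrix}; \quad\text{then}\quad
P^tAP = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -14 \end{pmatrix}
\]
The signature $S$ of $A$ is the difference $S = 1 - 2 = -1$.


12.9. Suppose $1 + 1 \ne 0$ in $K$. Give a formal algorithm to diagonalize (under congruence)
a symmetric matrix $A = (a_{ij})$ over $K$.


Case I: $a_{11} \ne 0$. Apply the row operations $R_i \to -a_{i1}R_1 + a_{11}R_i$, $i = 2, \ldots, n$, and then the
corresponding column operations $C_i \to -a_{i1}C_1 + a_{11}C_i$ to reduce $A$ to the form $\begin{pmatrix} a_{11} & 0 \\ 0 & B \end{pmatrix}$.


Case II: $a_{11} = 0$ but $a_{ii} \ne 0$, for some $i > 1$. Apply the row operation $R_1 \leftrightarrow R_i$ and then the
corresponding column operation $C_1 \leftrightarrow C_i$ to bring $a_{ii}$ into the first diagonal position. This reduces
the matrix to Case I.


Case III: All diagonal entries $a_{ii} = 0$. Choose $i, j$ such that $a_{ij} \ne 0$, and apply the row
operation $R_i \to R_j + R_i$ and the corresponding column operation $C_i \to C_j + C_i$ to bring $2a_{ij} \ne 0$ into the
$i$th diagonal position. This reduces the matrix to Case II.


In each of the cases, we can finally reduce $A$ to the form $\begin{pmatrix} a_{11} & 0 \\ 0 & B \end{pmatrix}$ where $B$ is a symmetric
matrix of order less than $A$. By induction we can finally bring $A$ into diagonal form.


Remark: The hypothesis that $1 + 1 \ne 0$ in $K$ is used in Case III where we state that $2a_{ij} \ne 0$.


12.10. Let $q$ be the quadratic form associated with the symmetric bilinear form $f$. Verify
the following polar form of $f$: $f(u, v) = \tfrac12\bigl(q(u + v) - q(u) - q(v)\bigr)$. (Assume that
$1 + 1 \ne 0$.)
\[
\begin{aligned}
q(u + v) - q(u) - q(v) &= f(u + v, u + v) - f(u, u) - f(v, v) \\
&= f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v) \\
&= 2f(u, v)
\end{aligned}
\]
If $1 + 1 \ne 0$, we can divide by 2 to obtain the required identity.


12.11. Prove Theorem 12.4: Let $f$ be a symmetric bilinear form on $V$ over $K$ (in which
$1 + 1 \ne 0$). Then $V$ has a basis $\{v_1, \ldots, v_n\}$ in which $f$ is represented by a diagonal
matrix, i.e. $f(v_i, v_j) = 0$ for $i \ne j$.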


Method 1.


If $f = 0$ or if $\dim V = 1$, then the theorem clearly holds. Hence we can suppose $f \ne 0$ and
$\dim V = n > 1$. If $q(v) = f(v, v) = 0$ for every $v \in V$, then the polar form of $f$ (see Problem 12.10)
implies that $f = 0$. Hence we can assume there is a vector $v_1 \in V$ such that $f(v_1, v_1) \ne 0$. Let
$U$ be the subspace spanned by $v_1$ and let $W$ consist of those vectors $v \in V$ for which $f(v_1, v) = 0$.
We claim that $V = U \oplus W$.


(i) Proof that $U \cap W = \{0\}$: Suppose $u \in U \cap W$. Since $u \in U$, $u = kv_1$ for some scalar $k \in K$.
Since $u \in W$, $0 = f(u, u) = f(kv_1, kv_1) = k^2 f(v_1, v_1)$. But $f(v_1, v_1) \ne 0$; hence $k = 0$ and
therefore $u = kv_1 = 0$. Thus $U \cap W = \{0\}$.


(ii) Proof that $V = U + W$: Let $v \in V$. Set
\[ w = v - \frac{f(v_1, v)}{f(v_1, v_1)}\, v_1 \qquad (1) \]
Then
\[ f(v_1, w) = f(v_1, v) - \frac{f(v_1, v)}{f(v_1, v_1)}\, f(v_1, v_1) = 0 \]
Thus $w \in W$. By (1), $v$ is the sum of an element of $U$ and an element of $W$. Thus $V = U + W$.
By (i) and (ii), $V = U \oplus W$.


Now $f$ restricted to $W$ is a symmetric bilinear form on $W$. But $\dim W = n - 1$; hence by
induction there is a basis $\{v_2, \ldots, v_n\}$ of $W$ such that $f(v_i, v_j) = 0$ for $i \ne j$ and $2 \le i, j \le n$. But
by the very definition of $W$, $f(v_1, v_j) = 0$ for $j = 2, \ldots, n$. Therefore the basis $\{v_1, \ldots, v_n\}$ of $V$
has the required property that $f(v_i, v_j) = 0$ for $i \ne j$.

Method 2.


The algorithm in Problem 12.9 shows that every symmetric matrix over $K$ is congruent to a
diagonal matrix. This is equivalent to the statement that $f$ has a diagonal matrix representation.


12.12. Let $A = \begin{pmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_n \end{pmatrix}$, a diagonal matrix over $K$. Show that:

(i) for any nonzero scalars $k_1, \ldots, k_n \in K$, $A$ is congruent to a diagonal matrix
with diagonal entries $a_ik_i^2$;

(ii) if $K$ is the complex field $\mathbf{C}$, then $A$ is congruent to a diagonal matrix with only
1's and 0's as diagonal entries;

(iii) if $K$ is the real field $\mathbf{R}$, then $A$ is congruent to a diagonal matrix with only
1's, $-1$'s and 0's as diagonal entries.


(i) Let $P$ be the diagonal matrix with diagonal entries $k_i$. Then
\[
P^tAP = \begin{pmatrix} k_1 & & \\ & \ddots & \\ & & k_n \end{pmatrix}
\begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_n \end{pmatrix}
\begin{pmatrix} k_1 & & \\ & \ddots & \\ & & k_n \end{pmatrix}
= \begin{pmatrix} a_1k_1^2 & & \\ & \ddots & \\ & & a_nk_n^2 \end{pmatrix}
\]

(ii) Let $P$ be the diagonal matrix with diagonal entries $b_i = \begin{cases} 1/\sqrt{a_i} & \text{if } a_i \ne 0 \\ 1 & \text{if } a_i = 0 \end{cases}$. Then $P^tAP$ has
the required form.

(iii) Let $P$ be the diagonal matrix with diagonal entries $b_i = \begin{cases} 1/\sqrt{|a_i|} & \text{if } a_i \ne 0 \\ 1 & \text{if } a_i = 0 \end{cases}$. Then $P^tAP$ has
the required form.

Remark. We emphasize that (ii) is no longer true if congruence is replaced by Hermitian
congruence (see Problems 12.40 and 12.41).


12.13. Prove Theorem 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there
is a basis of $V$ in which $f$ is represented by a diagonal matrix, and every other diagonal
representation of $f$ has the same number of positive entries and the same number of
negative entries.

By Theorem 12.4, there is a basis $\{u_1, \ldots, u_n\}$ of $V$ in which $f$ is represented by a diagonal
matrix, say, with $P$ positive and $N$ negative entries. Now suppose $\{w_1, \ldots, w_n\}$ is another basis of
$V$ in which $f$ is represented by a diagonal matrix, say, with $P'$ positive and $N'$ negative entries. We
can assume without loss in generality that the positive entries in each matrix appear first. Since
$\operatorname{rank}(f) = P + N = P' + N'$, it suffices to prove that $P = P'$.


Let $U$ be the linear span of $u_1, \ldots, u_P$ and let $W$ be the linear span of $w_{P'+1}, \ldots, w_n$. Then
$f(v, v) > 0$ for every nonzero $v \in U$, and $f(v, v) \le 0$ for every nonzero $v \in W$. Hence
$U \cap W = \{0\}$. Note that $\dim U = P$ and $\dim W = n - P'$. Thus
\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) = P + (n - P') - 0 = P - P' + n \]
But $\dim(U + W) \le \dim V = n$; hence $P - P' + n \le n$ or $P \le P'$. Similarly, $P' \le P$ and
therefore $P = P'$, as required.

Remark. The above theorem and proof depend only on the concept of positivity. Thus the
theorem is true for any subfield $K$ of the real field $\mathbf{R}$.


12.14. An $n \times n$ real symmetric matrix $A$ is said to be positive definite if $X^tAX > 0$ for
every nonzero (column) vector $X \in \mathbf{R}^n$, i.e. if $A$ is positive definite viewed as a
bilinear form. Let $B$ be any real nonsingular matrix. Show that (i) $B^tB$ is symmetric
and (ii) $B^tB$ is positive definite.

(i) $(B^tB)^t = B^tB^{tt} = B^tB$; hence $B^tB$ is symmetric.

(ii) Since $B$ is nonsingular, $BX \ne 0$ for any nonzero $X \in \mathbf{R}^n$. Hence the dot product of $BX$ with
itself, $BX \cdot BX = (BX)^t(BX)$, is positive. Thus $X^t(B^tB)X = (X^tB^t)(BX) = (BX)^t(BX) > 0$ as
required.


HERMITIAN FORMS 
12.15. Determine which of the following matrices are Hermitian: 


2 2-84. A= 5: Sov eA 4 +38 i: 
DES, 5 6 + 2i Dyn ee i Ba Aa | 
ARE Gs Oe 7 Aes ZS 5 1 -6| 


(i) (ii) (iii) 
A matrix $A = (a_{ij})$ is Hermitian iff $A = A^*$, i.e. iff $a_{ij} = \bar{a}_{ji}$.
(i) The matrix is Hermitian, since it is equal to its conjugate transpose. 


(ii) The matrix is not Hermitian, even though it is symmetric. 
(iii) The matrix is Hermitian. In fact, a real matrix is Hermitian if and only if it is symmetric. 


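12.16. Let $A$ be a Hermitian matrix. Show that $f$ is a Hermitian form on $\mathbf{C}^n$ where $f$ is
defined by $f(X, Y) = X^tA\bar{Y}$.

For all $a, b \in \mathbf{C}$ and all $X_1, X_2, Y \in \mathbf{C}^n$,
\[
\begin{aligned}
f(aX_1 + bX_2, Y) &= (aX_1 + bX_2)^tA\bar{Y} = (aX_1^t + bX_2^t)A\bar{Y} \\
&= aX_1^tA\bar{Y} + bX_2^tA\bar{Y} = a f(X_1, Y) + b f(X_2, Y)
\end{aligned}
\]
Hence $f$ is linear in the first variable. Also,
\[
\overline{f(X, Y)} = \overline{X^tA\bar{Y}} = \overline{(X^tA\bar{Y})^t} = \overline{\bar{Y}^tA^tX} = Y^tA^*\bar{X} = Y^tA\bar{X} = f(Y, X)
\]
Hence $f$ is a Hermitian form on $\mathbf{C}^n$. (Remark. We use the fact that $X^tA\bar{Y}$ is a scalar and so it is
equal to its transpose.)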


12.17. Let $f$ be a Hermitian form on $V$. Let $H$ be the matrix of $f$ in a basis $\{e_1, \ldots, e_n\}$ of
$V$. Show that:
(i) $f(u, v) = [u]_e^t\, H\, \overline{[v]_e}$ for all $u, v \in V$;
(ii) if $P$ is the transition matrix from $\{e_i\}$ to a new basis $\{e_i'\}$ of $V$, then $B = P^tH\bar{P}$
(or: $B = Q^*HQ$ where $Q = \bar{P}$) is the matrix of $f$ in the new basis $\{e_i'\}$.
Note that (ii) is the complex analog of Theorem 12.2.


(i) Let $u, v \in V$ and suppose
\[ u = a_1e_1 + a_2e_2 + \cdots + a_ne_n \quad\text{and}\quad v = b_1e_1 + b_2e_2 + \cdots + b_ne_n \]
Then
\[
\begin{aligned}
f(u, v) &= f(a_1e_1 + \cdots + a_ne_n,\ b_1e_1 + \cdots + b_ne_n) \\
&= \sum_{i,j} a_i\bar{b}_j f(e_i, e_j) = (a_1, \ldots, a_n)\, H \begin{pmatrix} \bar{b}_1 \\ \vdots \\ \bar{b}_n \end{pmatrix} = [u]_e^t\, H\, \overline{[v]_e}
\end{aligned}
\]
as required.

(ii) Since $P$ is the transition matrix from $\{e_i\}$ to $\{e_i'\}$, then
\[ P[u]_{e'} = [u]_e, \qquad P[v]_{e'} = [v]_e \]
and so
\[ [u]_e^t = [u]_{e'}^t P^t, \qquad \overline{[v]_e} = \bar{P}\,\overline{[v]_{e'}} \]
Thus by (i), $f(u, v) = [u]_e^t\, H\, \overline{[v]_e} = [u]_{e'}^t\, P^tH\bar{P}\, \overline{[v]_{e'}}$.
But $u$ and $v$ are arbitrary elements of $V$; hence $P^tH\bar{P}$ is the matrix of $f$ in the basis $\{e_i'\}$.


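12.18. Let $H = \begin{pmatrix} 1 & 1+i & 2i \\ 1-i & 4 & 2-3i \\ -2i & 2+3i & 7 \end{pmatrix}$, a Hermitian matrix. Find a nonsingular
matrix $P$ such that $P^tH\bar{P}$ is diagonal.

First form the block matrix $(H, I)$:
\[
(H, I) = \left(\begin{array}{ccc|ccc} 1 & 1+i & 2i & 1 & 0 & 0 \\ 1-i & 4 & 2-3i & 0 & 1 & 0 \\ -2i & 2+3i & 7 & 0 & 0 & 1 \end{array}\right)
\]
Apply the row operations $R_2 \to (-1+i)R_1 + R_2$ and $R_3 \to 2iR_1 + R_3$ to $(H, I)$ and then the
corresponding "Hermitian column operations" (see Problem 12.42) $C_2 \to (-1-i)C_1 + C_2$ and
$C_3 \to -2iC_1 + C_3$ to $H$ to obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 1+i & 2i & 1 & 0 & 0 \\ 0 & 2 & -5i & -1+i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1+i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right)
\]
Next apply the row operation $R_3 \to -5iR_2 + 2R_3$ and the corresponding Hermitian column
operation $C_3 \to 5iC_2 + 2C_3$ to obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1+i & 1 & 0 \\ 0 & 0 & -19 & 5+9i & -5i & 2 \end{array}\right)
\quad\text{and then}\quad
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & 0 & -1+i & 1 & 0 \\ 0 & 0 & -38 & 5+9i & -5i & 2 \end{array}\right)
\]
Now $H$ has been diagonalized. Set
\[
P = \begin{pmatrix} 1 & -1+i & 5+9i \\ 0 & 1 & -5i \\ 0 & 0 & 2 \end{pmatrix}
\quad\text{and then}\quad
P^tH\bar{P} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -38 \end{pmatrix}
\]
Observe that the signature $S$ of $H$ is $S = 2 - 1 = 1$.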
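
The result of Problem 12.18 can be confirmed by direct multiplication. The sketch below uses Python's built-in complex numbers; the helper names `matmul` and `transpose` are hypothetical, and the matrices $H$ and $P$ are those of the solution above.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


H = [[1, 1 + 1j, 2j],
     [1 - 1j, 4, 2 - 3j],
     [-2j, 2 + 3j, 7]]

P = [[1, -1 + 1j, 5 + 9j],
     [0, 1, -5j],
     [0, 0, 2]]

# Entrywise complex conjugate of P
P_bar = [[complex(z).conjugate() for z in row] for row in P]

# P^t H conj(P), which the solution states is diag(1, 2, -38)
D = matmul(matmul(transpose(P), H), P_bar)
```

The two positive and one negative diagonal entries of `D` give the signature $2 - 1 = 1$.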


MISCELLANEOUS PROBLEMS 


12.19. Show that any bilinear form $f$ on $V$ is the sum of a symmetric bilinear form and a
skew symmetric bilinear form.

Set $g(u, v) = \tfrac12[f(u, v) + f(v, u)]$ and $h(u, v) = \tfrac12[f(u, v) - f(v, u)]$. Then $g$ is symmetric because
\[ g(u, v) = \tfrac12[f(u, v) + f(v, u)] = \tfrac12[f(v, u) + f(u, v)] = g(v, u) \]
and $h$ is skew symmetric because
\[ h(u, v) = \tfrac12[f(u, v) - f(v, u)] = -\tfrac12[f(v, u) - f(u, v)] = -h(v, u) \]
Furthermore, $f = g + h$.
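
In matrix terms, the decomposition of Problem 12.19 splits the matrix $A$ of $f$ into $\tfrac12(A + A^t)$ and $\tfrac12(A - A^t)$. A minimal sketch with exact rational arithmetic follows; the function name `split_symmetric_skew` is hypothetical, and the sample matrix is the matrix $A$ found in Problem 12.3(i).

```python
from fractions import Fraction


def split_symmetric_skew(A):
    """Return (G, S) with G symmetric, S skew symmetric and G + S = A."""
    n = len(A)
    half = Fraction(1, 2)
    G = [[half * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    S = [[half * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return G, S


A = [[2, -1], [2, 0]]
G, S = split_symmetric_skew(A)
# G is [[2, 1/2], [1/2, 0]] and S is [[0, -3/2], [3/2, 0]]
```

The division by 2 is the only place where the sketch needs $1 + 1 \ne 0$.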




12.20. Prove Theorem 12.3: Let $f$ be an alternating bilinear form on $V$. Then there exists
a basis of $V$ in which $f$ is represented by a matrix of the form
\[
\begin{pmatrix}
0 & 1 & & & & & & \\
-1 & 0 & & & & & & \\
 & & \ddots & & & & & \\
 & & & 0 & 1 & & & \\
 & & & -1 & 0 & & & \\
 & & & & & 0 & & \\
 & & & & & & \ddots & \\
 & & & & & & & 0
\end{pmatrix}
\]
Moreover, the number of blocks $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is uniquely determined by $f$ (because it is equal
to $\tfrac12[\operatorname{rank}(f)]$).


If $f = 0$, then the theorem is obviously true. Also, if $\dim V = 1$, then $f(k_1u, k_2u) =
k_1k_2 f(u, u) = 0$ and so $f = 0$. Accordingly we can assume that $\dim V > 1$ and $f \ne 0$.


Since $f \ne 0$, there exist (nonzero) $u_1, u_2 \in V$ such that $f(u_1, u_2) \ne 0$. In fact, multiplying
$u_1$ by an appropriate factor, we can assume that $f(u_1, u_2) = 1$ and so $f(u_2, u_1) = -1$. Now $u_1$ and
$u_2$ are linearly independent; because if, say, $u_2 = ku_1$, then $f(u_1, u_2) = f(u_1, ku_1) = k f(u_1, u_1) = 0$.
Let $U$ be the subspace spanned by $u_1$ and $u_2$, i.e. $U = L(u_1, u_2)$. Note:


(i) the matrix representation of the restriction of $f$ to $U$ in the basis $\{u_1, u_2\}$ is $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$;
(ii) if $u \in U$, say $u = au_1 + bu_2$, then
\[
\begin{aligned}
f(u, u_1) &= f(au_1 + bu_2, u_1) = -b \\
f(u, u_2) &= f(au_1 + bu_2, u_2) = a
\end{aligned}
\]
Let $W$ consist of those vectors $w \in V$ such that $f(w, u_1) = 0$ and $f(w, u_2) = 0$. Equivalently,
\[ W = \{w \in V : f(w, u) = 0 \text{ for every } u \in U\} \]
We claim that $V = U \oplus W$. It is clear that $U \cap W = \{0\}$, and so it remains to show that
$V = U + W$. Let $v \in V$. Set
\[ u = f(v, u_2)u_1 - f(v, u_1)u_2 \quad\text{and}\quad w = v - u \qquad (1) \]
Since $u$ is a linear combination of $u_1$ and $u_2$, $u \in U$. We show that $w \in W$. By (1) and (ii),
$f(u, u_1) = f(v, u_1)$; hence
\[ f(w, u_1) = f(v - u, u_1) = f(v, u_1) - f(u, u_1) = 0 \]
Similarly, $f(u, u_2) = f(v, u_2)$ and so
\[ f(w, u_2) = f(v - u, u_2) = f(v, u_2) - f(u, u_2) = 0 \]
Then $w \in W$ and so, by (1), $v = u + w$ where $u \in U$ and $w \in W$. This shows that $V = U + W$;
and therefore $V = U \oplus W$.


Now the restriction of $f$ to $W$ is an alternating bilinear form on $W$. By induction, there exists
a basis $u_3, \ldots, u_n$ of $W$ in which the matrix representing $f$ restricted to $W$ has the desired form.
Thus $u_1, u_2, u_3, \ldots, u_n$ is a basis of $V$ in which the matrix representing $f$ has the desired form.




Supplementary Problems 


BILINEAR FORMS 


12.21. 


12.22. 


12.23. 


12.24. 


12.25. 


12.26. 


12.27. 


12.28. 


Let $u = (x_1, x_2)$ and $v = (y_1, y_2)$. Determine which of the following are bilinear forms on $\mathbf{R}^2$:

(i) $f(u, v) = 2x_1y_2 - 3x_2y_1$     (iv) $f(u, v) = x_1x_2 + y_1y_2$
(ii) $f(u, v) = x_1 + y_2$     (v) $f(u, v) = 1$
(iii) $f(u, v) = 3x_2y_2$     (vi) $f(u, v) = 0$


Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
\[ f((x_1, x_2), (y_1, y_2)) = 3x_1y_1 - 2x_1y_2 + 4x_2y_1 - x_2y_2 \]
(i) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 1),\ u_2 = (1, 2)\}$.
(ii) Find the matrix $B$ of $f$ in the basis $\{v_1 = (1, -1),\ v_2 = (3, 1)\}$.
(iii) Find the transition matrix $P$ from $\{u_i\}$ to $\{v_i\}$ and verify that $B = P^tAP$.


eae, ai 
3 ae and let f(A,B) = 


tr(A'MB) where A,BEV and “tr” denotes trace. (i) Show that f is a bilinear form on V. 
(ii) Find the matrix of f in the basis ae 4 4 ; 0 0 ; Op : 
Om 0 0 AsO eal 


Let $B(V)$ be the set of bilinear forms on $V$ over $K$. Prove:


(i) if $f, g \in B(V)$, then $f + g$ and $kf$, for $k \in K$, also belong to $B(V)$, and so $B(V)$ is a subspace
of the vector space of functions from $V \times V$ into $K$;


Let V be the vector space of 2X2 matrices over R. Let M = ( 


(ii) if $\phi$ and $\sigma$ are linear functionals on $V$, then $f(u, v) = \phi(u)\,\sigma(v)$ belongs to $B(V)$.


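Let $f$ be a bilinear form on $V$. For any subset $S$ of $V$, we write
\[ S^\perp = \{v \in V : f(u, v) = 0 \text{ for every } u \in S\}, \qquad S^\top = \{v \in V : f(v, u) = 0 \text{ for every } u \in S\} \]
Show that: (i) $S^\perp$ and $S^\top$ are subspaces of $V$; (ii) $S_1 \subset S_2$ implies $S_2^\perp \subset S_1^\perp$ and $S_2^\top \subset S_1^\top$;
(iii) $\{0\}^\perp = \{0\}^\top = V$.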


Prove: If $f$ is a bilinear form on $V$, then $\operatorname{rank}(f) = \dim V - \dim V^\perp = \dim V - \dim V^\top$ and
hence $\dim V^\perp = \dim V^\top$.


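Let $f$ be a bilinear form on $V$. For each $u \in V$, let $\hat{u} : V \to K$ and $\tilde{u} : V \to K$ be defined by
$\hat{u}(x) = f(x, u)$ and $\tilde{u}(x) = f(u, x)$. Prove:

(i) $\hat{u}$ and $\tilde{u}$ are each linear, i.e. $\hat{u}, \tilde{u} \in V^*$;

(ii) $u \mapsto \hat{u}$ and $u \mapsto \tilde{u}$ are each linear mappings from $V$ into $V^*$;

(iii) $\operatorname{rank}(f) = \operatorname{rank}(u \mapsto \hat{u}) = \operatorname{rank}(u \mapsto \tilde{u})$.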


Show that congruence of matrices is an equivalence relation, i.e. (i) $A$ is congruent to $A$; (ii) if $A$
is congruent to $B$, then $B$ is congruent to $A$; (iii) if $A$ is congruent to $B$ and $B$ is congruent to $C$,
then $A$ is congruent to $C$.


SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS 


12.29. 


Find the symmetric matrix belonging to each of the following quadratic polynomials:
(i) $q(x, y, z) = 2x^2 - 8xy + y^2 - 16xz + 14yz + 5z^2$

(ii) $q(x, y, z) = x^2 - xz + y^2$

(iii) $q(x, y, z) = xy + y^2 + 4xz + z^2$

(iv) $q(x, y, z) = xy + yz$.


276 


12.30. 


12.31. 


12.32. 


12.33. 


12.34, 


12.35. 


12.36. 


BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 


For each of the following matrices A, find a nonsingular matrix P such that PtAP is diagonal: 


Ue ee 8 
DAS hike 1°25 S41 

A a ee = ths ==) ss = 
(i) A i aah Gaile, vA : : : ie abn ha 72 Aighwig Sig tang 
= 3 fli Ose 


In each case find the rank and signature. 


Let q be the quadratic form associated with a symmetric bilinear form f. Verify the following 
alternate polar form of f: f(u,v) = 1[q(u+v) — qu—v)}. 


Let S(V) be the set of symmetric bilinear forms on V. Show that: 
(i) S(V) is a subspace of B(V); (ii) if dimV =n, then dim S(V) = dn(n + 1). 


Let f be the symmetric bilinear form associated with the real quadratic form gq(x,y) = 
ax? + bay + cy?. Show that: 


(i) f is nondegenerate if and only if 62—4ac 4 0; 
(ii) f is positive definite if and only if a >0 and 6b2—4ac <0. 


Suppose A is a real symmetric positive definite matrix. Show that there exists a nonsingular matrix 
P=such, that Ay — 2 "Ps 


n 
Consider a real quadratic polynomial q(#,,...,%,) = > a0"; Where a; = aj. 
ij=1 
(i) If a,,; #0, show that the substitution 
1 
eq) eee UA oe a eee Bare inYn)> %Q = Ya; oney Lyn = Yn 
yields the equation gq(x,,...,%,) = Anyi + q'(Yo,+-+»Yn), Where q’ is also a quadratic 


polynomial. 
(ii) If ay; =0 but, say, ay. #0, show that the substitution 
1 =Yrt Yo, Xo = Yy— Yo, %3 = Yar +++ Xn = Yn 
yields the equation 9(x1,..-,%,) = 3 biyy;, where 6,;~0, i.e. reduces this case to case (i). 


This method of diagonalizing q is known as ‘‘completing the square”. 


Use steps of the type in the preceding problem to reduce each quadratic polynomial in Problem 
12.29 to diagonal form. Find the rank and signature in each case. 


HERMITIAN FORMS 


12.37. For any complex matrices A, B and any $k \in \mathbf{C}$, show that:
(i) $\overline{A+B} = \bar{A} + \bar{B}$, (ii) $\overline{kA} = \bar{k}\bar{A}$, (iii) $\overline{AB} = \bar{A}\bar{B}$, (iv) $\overline{A^t} = \bar{A}^t$


12.38. For each of the following Hermitian matrices H, find a nonsingular matrix P such that $P^tH\bar{P}$ is
diagonal:
(i) $H = \begin{pmatrix} 1 & i \\ -i & 2 \end{pmatrix}$, (ii) $H = \begin{pmatrix} 1 & 2+3i \\ 2-3i & -1 \end{pmatrix}$, (iii) $H = \begin{pmatrix} 1 & i & 2+i \\ -i & 2 & 1-i \\ 2-i & 1+i & 2 \end{pmatrix}$

Find the rank and signature in each case.


12.39. Let A be any complex nonsingular matrix. Show that $H = A^*A$ is Hermitian and positive
definite.


12.40. We say that B is Hermitian congruent to A if there exists a nonsingular matrix Q such that
$B = Q^*AQ$. Show that Hermitian congruence is an equivalence relation.


12.41. Prove Theorem 12.7: Let f be a Hermitian form on V. Then there exists a basis $\{e_1, \ldots, e_n\}$ of V
in which f is represented by a diagonal matrix, i.e. $f(e_i, e_j) = 0$ for $i \ne j$. Moreover, every diagonal
representation of f has the same number P of positive entries and the same number N of negative
entries. (Note that the second part of the theorem does not hold for complex symmetric bilinear
forms, as seen by Problem 12.12(ii). However, the proof of Theorem 12.5 in Problem 12.13 does
carry over to the Hermitian case.)


MISCELLANEOUS PROBLEMS 


12.42. Consider the following elementary row operations:
$[a_1]\ R_i \leftrightarrow R_j, \qquad [a_2]\ R_i \to kR_i,\ k \ne 0, \qquad [a_3]\ R_i \to kR_j + R_i$

The corresponding elementary column operations are, respectively,
$[b_1]\ C_i \leftrightarrow C_j, \qquad [b_2]\ C_i \to kC_i,\ k \ne 0, \qquad [b_3]\ C_i \to kC_j + C_i$

If K is the complex field C, then the corresponding Hermitian column operations are, respectively,
$[c_1]\ C_i \leftrightarrow C_j, \qquad [c_2]\ C_i \to \bar{k}C_i,\ k \ne 0, \qquad [c_3]\ C_i \to \bar{k}C_j + C_i$

(i) Show that the elementary matrix corresponding to $[b_i]$ is the transpose of the elementary
matrix corresponding to $[a_i]$.

(ii) Show that the elementary matrix corresponding to $[c_i]$ is the conjugate transpose of the
elementary matrix corresponding to $[a_i]$.


12.43. Let V and W be vector spaces over K. A mapping $f: V \times W \to K$ is called a bilinear form on V
and W if:

(i) $f(av_1 + bv_2, w) = af(v_1, w) + bf(v_2, w)$

(ii) $f(v, aw_1 + bw_2) = af(v, w_1) + bf(v, w_2)$
for every $a, b \in K$, $v_i \in V$, $w_i \in W$. Prove the following:

(i) The set B(V, W) of bilinear forms on V and W is a subspace of the vector space of functions
from V × W into K.

(ii) If $\{\phi_1, \ldots, \phi_m\}$ is a basis of $V^*$ and $\{\sigma_1, \ldots, \sigma_n\}$ is a basis of $W^*$, then $\{f_{ij} : i = 1, \ldots, m,\ j = 1, \ldots, n\}$
is a basis of B(V, W), where $f_{ij}$ is defined by $f_{ij}(v, w) = \phi_i(v)\,\sigma_j(w)$. Thus
$\dim B(V, W) = \dim V \cdot \dim W$.

(Remark. Observe that if V = W, then we obtain the space B(V) investigated in this chapter.)


12.44. Let V be a vector space over K. A mapping $f: V \times V \times \cdots \times V \to K$ (m times) is called a multilinear (or:
m-linear) form on V if f is linear in each variable, i.e. for $i = 1, \ldots, m$,
$f(\ldots, \widehat{au + bv}, \ldots) = af(\ldots, \hat{u}, \ldots) + bf(\ldots, \hat{v}, \ldots)$
where $\hat{\ }$ denotes the ith component, and other components are held fixed. An m-linear form f is
said to be alternating if
$f(v_1, \ldots, v_m) = 0$ whenever $v_i = v_k$, $i \ne k$

Prove:
(i) The set $B_m(V)$ of m-linear forms on V is a subspace of the vector space of functions from
$V \times V \times \cdots \times V$ into K.
(ii) The set $A_m(V)$ of alternating m-linear forms on V is a subspace of $B_m(V)$.

Remark 1. If m = 2, then we obtain the space B(V) investigated in this chapter.

Remark 2. If $V = K^m$, then the determinant function is a particular alternating m-linear form on V.


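Answers to Supplementary Problems


12.21. (i) Yes (ii) No (iii) Yes (iv) No (v) No (vi) Yes


12.22. (i) $A = \begin{pmatrix} 4 & 1 \\ 7 & 3 \end{pmatrix}$ (ii) $B = \begin{pmatrix} 0 & -4 \\ 20 & 32 \end{pmatrix}$ (iii) $P = \begin{pmatrix} 3 & 5 \\ -2 & -2 \end{pmatrix}$


12.23. (ii) $\begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 2 \\ 3 & 0 & 5 & 0 \\ 0 & 3 & 0 & 5 \end{pmatrix}$


12.29. (i) $\begin{pmatrix} 2 & -4 & -8 \\ -4 & 1 & 7 \\ -8 & 7 & 5 \end{pmatrix}$ (ii) $\begin{pmatrix} 1 & 0 & -\tfrac12 \\ 0 & 1 & 0 \\ -\tfrac12 & 0 & 0 \end{pmatrix}$ (iii) $\begin{pmatrix} 0 & \tfrac12 & 2 \\ \tfrac12 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}$ (iv) $\begin{pmatrix} 0 & \tfrac12 & 0 \\ \tfrac12 & 0 & \tfrac12 \\ 0 & \tfrac12 & 0 \end{pmatrix}$


12.30. (i) $P = \begin{pmatrix} 1 & -3 \\ 0 & 2 \end{pmatrix}$, $P^tAP = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$, S = 0

(ii) $P = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{pmatrix}$, $P^tAP = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -38 \end{pmatrix}$, S = 1

(iii) $P = \begin{pmatrix} 1 & -1 & -1 & 26 \\ 0 & 1 & 3 & 13 \\ 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 7 \end{pmatrix}$, $P^tAP = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -7 & 0 \\ 0 & 0 & 0 & 469 \end{pmatrix}$, S = 2


12.38. (i) $P = \begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix}$, $P^tH\bar{P} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, S = 2

(ii) $P = \begin{pmatrix} 1 & -2+3i \\ 0 & 1 \end{pmatrix}$, $P^tH\bar{P} = \begin{pmatrix} 1 & 0 \\ 0 & -14 \end{pmatrix}$, S = 0

(iii) $P = \begin{pmatrix} 1 & i & -3+i \\ 0 & 1 & i \\ 0 & 0 & 1 \end{pmatrix}$, $P^tH\bar{P} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -4 \end{pmatrix}$, S = 1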


Chapter 13 


Inner Product Spaces 


INTRODUCTION 


The definition of a vector space V involves an arbitrary field K. In this chapter we 
restrict K to be either the real field R or the complex field C. In the first case we call V a 
real vector space, and in the second case a complex vector space. 


Recall that the concepts of “length” and “orthogonality” did not appear in the investiga- 
tion of arbitrary vector spaces (although they did appear in Chapter 1 on the spaces R” and 
C"). In this chapter we place an additional structure on a vector space V to obtain an 
inner product space, and in this context these concepts are defined. 


We emphasize that V shall denote a vector space of finite dimension unless otherwise 
stated or implied. In fact, many of the theorems in this chapter are not valid for spaces of 
infinite dimension. This is illustrated by some of the examples and problems. 


INNER PRODUCT SPACES 
We begin with a definition. 


Definition: Let V be a (real or complex) vector space over K. Suppose to each pair of
vectors $u, v \in V$ there is assigned a scalar $(u,v) \in K$. This mapping is called
an inner product in V if it satisfies the following axioms:

[I1] $(au_1 + bu_2, v) = a(u_1, v) + b(u_2, v)$

[I2] $(u,v) = \overline{(v,u)}$

[I3] $(u,u) \ge 0$; and $(u,u) = 0$ if and only if $u = 0$.

The vector space V with an inner product is called an inner product space.


Observe that $(u,u)$ is always real by [I2], and so the inequality relation in [I3] makes
sense. We also use the notation
$\|u\| = \sqrt{(u,u)}$

This nonnegative real number $\|u\|$ is called the norm or length of u. Also, using [I1] and
[I2] we obtain (Problem 13.1) the relation

$(u, av_1 + bv_2) = \bar{a}(u, v_1) + \bar{b}(u, v_2)$
If the base field K is real, the conjugate signs appearing above and in [I2] may be ignored.


In the language of the preceding chapter, an inner product is a positive definite sym- 
metric bilinear form if the base field is real, and is a positive definite Hermitian form if the 
base field is complex. 

A real inner product space is sometimes called a Euclidean space, and a complex inner 
product space is sometimes called a unitary space. 


Example 13.1: Consider the dot product in $\mathbf{R}^n$:
$u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$
where $u = (a_i)$ and $v = (b_i)$. This is an inner product on $\mathbf{R}^n$, and $\mathbf{R}^n$ with this
inner product is usually referred to as Euclidean n-space. Although there are
many different ways to define an inner product on $\mathbf{R}^n$ (see Problem 13.2), we shall
assume this inner product on $\mathbf{R}^n$ unless otherwise stated or implied.


Example 13.2: Consider the dot product on $\mathbf{C}^n$:
$u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n$
where $u = (z_i)$ and $v = (w_i)$. As in the real case, this is an inner product on $\mathbf{C}^n$
and we shall assume this inner product on $\mathbf{C}^n$ unless otherwise stated or implied.


Example 13.3: Let V denote the vector space of m × n matrices over R. The following is an inner
product in V:
$(A,B) = \mathrm{tr}(B^tA)$
where tr stands for trace, the sum of the diagonal elements.

Analogously, if U denotes the vector space of m × n matrices over C, then the
following is an inner product in U:
$(A,B) = \mathrm{tr}(B^*A)$
As usual, $B^*$ denotes the conjugate transpose of the matrix B.


Example 13.4: Let V be the vector space of real continuous functions on the interval $a \le t \le b$.
Then the following is an inner product on V:
$(f,g) = \int_a^b f(t)\,g(t)\,dt$

Analogously, if U denotes the vector space of complex continuous functions on the
(real) interval $a \le t \le b$, then the following is an inner product on U:
$(f,g) = \int_a^b f(t)\,\overline{g(t)}\,dt$


Example 13.5: Let V be the vector space of infinite sequences of real numbers $(a_1, a_2, \ldots)$ satisfying
$\sum_{i=1}^{\infty} a_i^2 = a_1^2 + a_2^2 + \cdots < \infty$
i.e. the sum converges. Addition and scalar multiplication are defined componentwise:
$(a_1, a_2, \ldots) + (b_1, b_2, \ldots) = (a_1 + b_1, a_2 + b_2, \ldots)$
$k(a_1, a_2, \ldots) = (ka_1, ka_2, \ldots)$
An inner product is defined in V by
$((a_1, a_2, \ldots), (b_1, b_2, \ldots)) = a_1b_1 + a_2b_2 + \cdots$
The above sum converges absolutely for any pair of points in V (Problem 13.44);
hence the inner product is well defined. This inner product space is called $l_2$-space
(or: Hilbert space).
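
The finite-dimensional inner products of Examples 13.1 to 13.3 can be sketched in a few lines of Python. This is only an illustration: the function names `dot_real`, `dot_complex` and `trace_inner` are our own, vectors are plain lists, and matrices are lists of rows.

```python
# Inner products from Examples 13.1-13.3, written with plain Python lists.

def dot_real(u, v):
    """Dot product on R^n: a1*b1 + ... + an*bn."""
    return sum(a * b for a, b in zip(u, v))

def dot_complex(u, v):
    """Dot product on C^n: z1*conj(w1) + ... + zn*conj(wn)."""
    return sum(z * complex(w).conjugate() for z, w in zip(u, v))

def trace_inner(A, B):
    """(A, B) = tr(B^t A) for real m x n matrices given as lists of rows.

    tr(B^t A) is the sum of the entrywise products a_ij * b_ij.
    """
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

u = [1 + 2j, 3j]
v = [2, 1 - 1j]
print(dot_complex(u, v))   # conjugate of the next line, as axiom [I2] requires
print(dot_complex(v, u))
print(dot_complex(u, u))   # real and nonnegative, as axiom [I3] requires
```

Note that the conjugate is taken on the second argument, which matches the convention of this chapter (linearity in the first variable).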


Remark 1: If $\|v\| = 1$, i.e. if $(v,v) = 1$, then v is called a unit vector or is said to be
normalized. We note that every nonzero vector $u \in V$ can be normalized by
setting $v = u/\|u\|$.


Remark 2: The nonnegative real number $d(u,v) = \|v - u\|$ is called the distance between
u and v; this function does satisfy the axioms of a metric space (see Problem
13.51).


CAUCHY-SCHWARZ INEQUALITY


The following formula, called the Cauchy-Schwarz inequality, is used in many branches
of mathematics.


Theorem 13.1 (Cauchy-Schwarz): For any vectors $u, v \in V$,
$|(u,v)| \le \|u\|\,\|v\|$


Next we examine this inequality in specific cases.


Example 13.6: Consider any complex numbers $a_1, \ldots, a_n, b_1, \ldots, b_n \in \mathbf{C}$. Then by the Cauchy-
Schwarz inequality,
$|a_1\bar{b}_1 + \cdots + a_n\bar{b}_n|^2 \le (|a_1|^2 + \cdots + |a_n|^2)(|b_1|^2 + \cdots + |b_n|^2)$
that is, $|u \cdot v|^2 \le \|u\|^2\,\|v\|^2$
where $u = (a_i)$ and $v = (b_i)$.


Example 13.7: Let f and g be any real continuous functions defined on the unit interval $0 \le t \le 1$.
Then by the Cauchy-Schwarz inequality,
$|(f,g)|^2 = \left(\int_0^1 f(t)\,g(t)\,dt\right)^2 \le \int_0^1 f^2(t)\,dt \int_0^1 g^2(t)\,dt = \|f\|^2\,\|g\|^2$

Here V is the inner product space of Example 13.4.
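
A quick numerical check of the Cauchy-Schwarz inequality on $\mathbf{C}^n$ can be written as follows. This is a sketch only; the helper names `inner`, `norm` and `cauchy_schwarz_gap` are ours, and floating point rounding means equality is observed only approximately.

```python
import math

def inner(u, v):
    """Dot product on C^n, conjugating the second argument."""
    return sum(z * complex(w).conjugate() for z, w in zip(u, v))

def norm(u):
    """||u|| = sqrt((u, u)); (u, u) is real, so its real part is taken."""
    return math.sqrt(inner(u, u).real)

def cauchy_schwarz_gap(u, v):
    """Return ||u|| ||v|| - |(u, v)|, which Theorem 13.1 says is nonnegative."""
    return norm(u) * norm(v) - abs(inner(u, v))

print(cauchy_schwarz_gap([1 + 1j, 2], [3, 1j]))  # strictly positive
print(cauchy_schwarz_gap([1, 2], [2, 4]))        # about 0: v is a multiple of u
```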


ORTHOGONALITY 


Let V be an inner product space. The vectors $u, v \in V$ are said to be orthogonal if
$(u,v) = 0$. The relation is clearly symmetric; that is, if u is orthogonal to v, then
$(v,u) = \overline{(u,v)} = \bar{0} = 0$ and so v is orthogonal to u. We note that $0 \in V$ is orthogonal to every
$v \in V$, for
$(0,v) = (0v, v) = 0(v,v) = 0$

Conversely, if u is orthogonal to every $v \in V$, then $(u,u) = 0$ and hence $u = 0$ by [I3].
Now suppose W is any subset of V. The orthogonal complement of W, denoted by $W^\perp$
(read "W perp"), consists of those vectors in V which are orthogonal to every $w \in W$:
$W^\perp = \{v \in V : (v,w) = 0 \text{ for every } w \in W\}$
We show that $W^\perp$ is a subspace of V. Clearly, $0 \in W^\perp$. Now suppose $u, v \in W^\perp$. Then
for any $a, b \in K$ and any $w \in W$,
$(au + bv, w) = a(u,w) + b(v,w) = a \cdot 0 + b \cdot 0 = 0$

Thus $au + bv \in W^\perp$ and therefore $W^\perp$ is a subspace of V.


Theorem 13.2: Let W be a subspace of V. Then V is the direct sum of W and $W^\perp$, i.e.
$V = W \oplus W^\perp$.
Now if W is a subspace of V, then $V = W \oplus W^\perp$ by the above theorem; hence there is
a unique projection $E_W: V \to V$ with image W and kernel $W^\perp$. That is, if $v \in V$ and
$v = w + w'$, where $w \in W$, $w' \in W^\perp$, then $E_W$ is defined by $E_W(v) = w$. This mapping
$E_W$ is called the orthogonal projection of V onto W.


Example 13.8: Let W be the z axis in $\mathbf{R}^3$, i.e.
$W = \{(0,0,c) : c \in \mathbf{R}\}$
Then $W^\perp$ is the xy plane, i.e.
$W^\perp = \{(a,b,0) : a, b \in \mathbf{R}\}$
As noted previously, $\mathbf{R}^3 = W \oplus W^\perp$. The
orthogonal projection E of $\mathbf{R}^3$ onto W is given
by $E(x,y,z) = (0,0,z)$.


Example 13.9: Consider a homogeneous system of linear equations over R:
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0$
$\cdots$
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0$
or in matrix notation AX = 0. Recall that the solution space W may be viewed
as the kernel of the linear operator A. We may also view W as the set of all vectors
$v = (x_1, \ldots, x_n)$ which are orthogonal to each row of A. Thus W is the orthogonal
complement of the row space of A. Theorem 13.2 then gives another proof of the
fundamental result: $\dim W = n - \mathrm{rank}(A)$.


Remark: If V is a real inner product space, then the angle $\theta$ between nonzero vectors
$u, v \in V$ is defined by
$\cos\theta = \dfrac{(u,v)}{\|u\|\,\|v\|}$
By the Cauchy-Schwarz inequality, $-1 \le \cos\theta \le 1$ and so the angle $\theta$ always
exists. Observe that u and v are orthogonal if and only if they are "perpendicular",
i.e. $\theta = \pi/2$.
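
As an illustration of the angle formula in Euclidean n-space with the usual dot product, consider the following sketch. The function name `angle` is ours, and the clamping step is an assumption added only to guard against rounding pushing the cosine slightly outside [-1, 1].

```python
import math

def angle(u, v):
    """Angle between nonzero real vectors u and v, in radians."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = dot / (norm_u * norm_v)
    # Cauchy-Schwarz guarantees |cos_theta| <= 1; clamp to absorb rounding error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

print(angle([1, 0], [0, 1]))  # orthogonal vectors: pi/2
print(angle([1, 0], [1, 1]))  # pi/4
```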


ORTHONORMAL SETS 


A set $\{u_i\}$ of vectors in V is said to be orthogonal if its distinct elements are orthogonal,
i.e. if $(u_i, u_j) = 0$ for $i \ne j$. In particular, the set $\{u_i\}$ is said to be orthonormal if it is
orthogonal and if each $u_i$ has length 1, that is, if
$(u_i, u_j) = \delta_{ij} = \begin{cases} 0 & \text{for } i \ne j \\ 1 & \text{for } i = j \end{cases}$

An orthonormal set can always be obtained from an orthogonal set of nonzero vectors by
normalizing each vector.


Example 13.10: Consider the usual basis of Euclidean 3-space $\mathbf{R}^3$:
$\{e_1 = (1,0,0),\ e_2 = (0,1,0),\ e_3 = (0,0,1)\}$
It is clear that
$(e_1, e_1) = (e_2, e_2) = (e_3, e_3) = 1$ and $(e_i, e_j) = 0$ for $i \ne j$
That is, $\{e_1, e_2, e_3\}$ is an orthonormal basis of $\mathbf{R}^3$. More generally, the usual basis
of $\mathbf{R}^n$ or of $\mathbf{C}^n$ is orthonormal for every n.


Example 13.11: Let V be the vector space of real continuous functions on the interval $-\pi \le t \le \pi$
with inner product defined by $(f,g) = \int_{-\pi}^{\pi} f(t)\,g(t)\,dt$. The following is a classical
example of an orthogonal subset of V:
$\{1, \cos t, \cos 2t, \ldots, \sin t, \sin 2t, \ldots\}$
The above orthogonal set plays a fundamental role in the theory of Fourier series.


The following properties of an orthonormal set will be used in the next section. 


Lemma 13.3: An orthonormal set $\{u_1, \ldots, u_r\}$ is linearly independent and, for any $v \in V$,
the vector
$w = v - (v, u_1)u_1 - (v, u_2)u_2 - \cdots - (v, u_r)u_r$
is orthogonal to each of the $u_i$.


GRAM-SCHMIDT ORTHOGONALIZATION PROCESS 


Orthonormal bases play an important role in inner product spaces. The next theorem
shows that such a basis always exists; its proof uses the celebrated Gram-Schmidt
orthogonalization process.


Theorem 13.4: Let $\{v_1, \ldots, v_n\}$ be an arbitrary basis of an inner product space V. Then
there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V such that the transition
matrix from $\{v_i\}$ to $\{u_i\}$ is triangular; that is, for $i = 1, \ldots, n$,
$u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{ii}v_i$


Proof. We set $u_1 = v_1/\|v_1\|$; then $\{u_1\}$ is orthonormal. We next set
$w_2 = v_2 - (v_2, u_1)u_1$ and $u_2 = w_2/\|w_2\|$

By Lemma 13.3, $w_2$ (and hence $u_2$) is orthogonal to $u_1$; then $\{u_1, u_2\}$ is orthonormal. We next
set
$w_3 = v_3 - (v_3, u_1)u_1 - (v_3, u_2)u_2$ and $u_3 = w_3/\|w_3\|$
Again, by Lemma 13.3, $w_3$ (and hence $u_3$) is orthogonal to $u_1$ and $u_2$; then $\{u_1, u_2, u_3\}$ is
orthonormal. In general, after obtaining $\{u_1, \ldots, u_i\}$ we set
$w_{i+1} = v_{i+1} - (v_{i+1}, u_1)u_1 - \cdots - (v_{i+1}, u_i)u_i$ and $u_{i+1} = w_{i+1}/\|w_{i+1}\|$

(Note that $w_{i+1} \ne 0$ because $v_{i+1} \notin L(v_1, \ldots, v_i) = L(u_1, \ldots, u_i)$.) As above, $\{u_1, \ldots, u_{i+1}\}$
is also orthonormal. By induction we obtain an orthonormal set $\{u_1, \ldots, u_n\}$ which is
independent and hence a basis of V. The specific construction guarantees that the transition
matrix is indeed triangular.


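Example 13.12: Consider the following basis of Euclidean space $\mathbf{R}^3$:
$\{v_1 = (1,1,1),\ v_2 = (0,1,1),\ v_3 = (0,0,1)\}$

We use the Gram-Schmidt orthogonalization process to transform $\{v_i\}$ into an
orthonormal basis $\{u_i\}$. First we normalize $v_1$, i.e. we set
$u_1 = \dfrac{v_1}{\|v_1\|} = \dfrac{(1,1,1)}{\sqrt{3}} = \left(\dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}\right)$
Next we set
$w_2 = v_2 - (v_2, u_1)u_1 = (0,1,1) - \dfrac{2}{\sqrt{3}}\left(\dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}\right) = \left(-\dfrac{2}{3}, \dfrac{1}{3}, \dfrac{1}{3}\right)$
and then we normalize $w_2$, i.e. we set
$u_2 = \dfrac{w_2}{\|w_2\|} = \left(-\dfrac{2}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}\right)$
Finally we set
$w_3 = v_3 - (v_3, u_1)u_1 - (v_3, u_2)u_2$
$\quad = (0,0,1) - \dfrac{1}{\sqrt{3}}\left(\dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}\right) - \dfrac{1}{\sqrt{6}}\left(-\dfrac{2}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}\right) = \left(0, -\dfrac{1}{2}, \dfrac{1}{2}\right)$
and then we normalize $w_3$:
$u_3 = \dfrac{w_3}{\|w_3\|} = \left(0, -\dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}}\right)$

The required orthonormal basis of $\mathbf{R}^3$ is
$\left\{ u_1 = \left(\dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}\right),\ u_2 = \left(-\dfrac{2}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}\right),\ u_3 = \left(0, -\dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}}\right) \right\}$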
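
The Gram-Schmidt process of Theorem 13.4 can be sketched in Python for real vectors with the usual dot product. The function names `inner` and `gram_schmidt` are ours; the sketch assumes the input vectors are linearly independent (otherwise a division by zero occurs) and makes no attempt at numerical refinement.

```python
import math

def inner(u, v):
    """Usual dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return an orthonormal basis obtained from `basis` as in Theorem 13.4."""
    ortho = []
    for v in basis:
        # w = v - (v, u_1)u_1 - ... - (v, u_i)u_i, as in Lemma 13.3
        w = list(v)
        for u in ortho:
            c = inner(v, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        length = math.sqrt(inner(w, w))  # nonzero for an independent input
        ortho.append([wi / length for wi in w])
    return ortho

# The basis of Example 13.12
for u in gram_schmidt([[1, 1, 1], [0, 1, 1], [0, 0, 1]]):
    print(u)
```

Applied to the basis of Example 13.12, the printed vectors agree (up to rounding) with $u_1$, $u_2$, $u_3$ found there.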


LINEAR FUNCTIONALS AND ADJOINT OPERATORS
Let V be an inner product space. Each $u \in V$ determines a mapping $\hat{u}: V \to K$ defined by
$\hat{u}(v) = (v,u)$

Now for any $a, b \in K$ and any $v_1, v_2 \in V$,
$\hat{u}(av_1 + bv_2) = (av_1 + bv_2, u) = a(v_1, u) + b(v_2, u) = a\hat{u}(v_1) + b\hat{u}(v_2)$
That is, $\hat{u}$ is a linear functional on V. The converse is also true for spaces of finite
dimension and is an important theorem. Namely,


Theorem 13.5: Let $\phi$ be a linear functional on a finite dimensional inner product space V.
Then there exists a unique vector $u \in V$ such that $\phi(v) = (v,u)$ for every
$v \in V$.


We remark that the above theorem is not valid for spaces of infinite dimension (Problem
13.45), although some general results in this direction are known. (One such famous result
is the Riesz representation theorem.)


We use the above theorem to prove 


Theorem 13.6: Let T be a linear operator on a finite dimensional inner product space V.
Then there exists a unique linear operator $T^*$ on V such that
$(T(u), v) = (u, T^*(v))$
for every $u, v \in V$. Moreover, if A is the matrix of T relative to an
orthonormal basis $\{e_i\}$ of V, then the conjugate transpose $A^*$ of A is the
matrix of $T^*$ in the basis $\{e_i\}$.


We emphasize that no such simple relationship exists between the matrices representing 
T and T* if the basis is not orthonormal. Thus we see one useful property of orthonormal 
bases. 


Definition: A linear operator T on an inner product space V is said to have an adjoint 
operator T* on V if (T(u),v) = (u,T*(v)) for every u,v € V. 


Thus Theorem 13.6 states that every operator T has an adjoint if V has finite dimension. 
This theorem is not valid if V has infinite dimension (Problem 13.78). 


Example 13.13: Let T be the linear operator on $\mathbf{C}^3$ defined by
$T(x,y,z) = (2x + iy,\ y - 5iz,\ x + (1-i)y + 3z)$

We find a similar formula for the adjoint $T^*$ of T. Note (Problem 7.3) that the
matrix of T in the usual basis of $\mathbf{C}^3$ is
$[T] = \begin{pmatrix} 2 & i & 0 \\ 0 & 1 & -5i \\ 1 & 1-i & 3 \end{pmatrix}$

Recall that the usual basis is orthonormal. Thus by Theorem 13.6, the matrix of $T^*$
in this basis is the conjugate transpose of [T]:
$[T^*] = \begin{pmatrix} 2 & 0 & 1 \\ -i & 1 & 1+i \\ 0 & 5i & 3 \end{pmatrix}$

Accordingly,
$T^*(x,y,z) = (2x + z,\ -ix + y + (1+i)z,\ 5iy + 3z)$
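
The defining relation $(T(u), v) = (u, T^*(v))$ can be checked numerically for the operator of Example 13.13. The following is a sketch with our own helper names (`conj_transpose`, `apply`, `inner`); the test vectors u and v are arbitrary choices made for illustration.

```python
def conj_transpose(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[j][i]).conjugate() for j in range(rows)] for i in range(cols)]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xk for m, xk in zip(row, x)) for row in M]

def inner(u, v):
    """Dot product on C^n, conjugating the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

# Matrix of T from Example 13.13 in the usual (orthonormal) basis of C^3
T = [[2, 1j, 0],
     [0, 1, -5j],
     [1, 1 - 1j, 3]]
T_star = conj_transpose(T)

u = [1, 1j, 2]    # arbitrary test vectors
v = [1j, 1, -1]
lhs = inner(apply(T, u), v)       # (T(u), v)
rhs = inner(u, apply(T_star, v))  # (u, T*(v))
print(lhs, rhs)
```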


The following theorem summarizes some of the properties of the adjoint.
Theorem 13.7: Let S and T be linear operators on V and let $k \in K$. Then:
(i) $(S + T)^* = S^* + T^*$
(ii) $(kT)^* = \bar{k}T^*$
(iii) $(ST)^* = T^*S^*$
(iv) $(T^*)^* = T$


ANALOGY BETWEEN A(V) AND C, SPECIAL OPERATORS 


Let A(V) denote the algebra of all linear operators on a finite dimensional inner product
space V. The adjoint mapping $T \mapsto T^*$ on A(V) is quite analogous to the conjugation
mapping $z \mapsto \bar{z}$ on the complex field C. To illustrate this analogy we identify in the following
table certain classes of operators $T \in A(V)$ whose behavior under the adjoint map imitates
the behavior under conjugation of familiar classes of complex numbers.


Class of complex numbers | Behavior under conjugation | Class of operators in A(V) | Behavior under the adjoint map
Unit circle ($|z| = 1$) | $\bar{z} = 1/z$ | Orthogonal operators (real case); unitary operators (complex case) | $T^* = T^{-1}$
Real axis | $\bar{z} = z$ | Self-adjoint operators; also called symmetric (real case), Hermitian (complex case) | $T^* = T$
Imaginary axis | $\bar{z} = -z$ | Skew-adjoint operators; also called skew-symmetric (real case), skew-Hermitian (complex case) | $T^* = -T$
Positive half axis $(0, \infty)$ | $z = \bar{w}w$, $w \ne 0$ | Positive definite operators | $T = S^*S$ with S nonsingular


The analogy between these classes of operators TJ and complex numbers z is reflected in 
the following theorem. 


Theorem 13.8: Let $\lambda$ be an eigenvalue of a linear operator T on V.
(i) If $T^* = T^{-1}$, then $|\lambda| = 1$.
(ii) If $T^* = T$, then $\lambda$ is real.
(iii) If $T^* = -T$, then $\lambda$ is pure imaginary.
(iv) If $T = S^*S$ with S nonsingular, then $\lambda$ is real and positive.


We now prove the above theorem. In each case let v be a nonzero eigenvector of T
belonging to $\lambda$, that is, $T(v) = \lambda v$ with $v \ne 0$; hence $(v,v)$ is positive.


Proof of (i): We show that $\lambda\bar{\lambda}(v,v) = (v,v)$:
$\lambda\bar{\lambda}(v,v) = (\lambda v, \lambda v) = (T(v), T(v)) = (v, T^*T(v)) = (v, I(v)) = (v,v)$

But $(v,v) \ne 0$; hence $\lambda\bar{\lambda} = 1$ and so $|\lambda| = 1$.

Proof of (ii): We show that $\lambda(v,v) = \bar{\lambda}(v,v)$:

$\lambda(v,v) = (\lambda v, v) = (T(v), v) = (v, T^*(v)) = (v, T(v)) = (v, \lambda v) = \bar{\lambda}(v,v)$

But $(v,v) \ne 0$; hence $\lambda = \bar{\lambda}$ and so $\lambda$ is real.

Proof of (iii): We show that $\lambda(v,v) = -\bar{\lambda}(v,v)$:

$\lambda(v,v) = (\lambda v, v) = (T(v), v) = (v, T^*(v)) = (v, -T(v)) = (v, -\lambda v) = -\bar{\lambda}(v,v)$

But $(v,v) \ne 0$; hence $\lambda = -\bar{\lambda}$ or $\bar{\lambda} = -\lambda$, and so $\lambda$ is pure imaginary.


Proof of (iv): Note first that $S(v) \ne 0$ because S is nonsingular; hence $(S(v), S(v))$ is
positive. We show that $\lambda(v,v) = (S(v), S(v))$:

$\lambda(v,v) = (\lambda v, v) = (T(v), v) = (S^*S(v), v) = (S(v), S(v))$
But $(v,v)$ and $(S(v), S(v))$ are positive; hence $\lambda$ is positive.


We remark that all the above operators T commute with their adjoint, that is, 
TT* =T*T. Such operators are called normal operators. 


ORTHOGONAL AND UNITARY OPERATORS 


Let U be a linear operator on a finite dimensional inner product space V. As defined
above, if
$U^* = U^{-1}$ or equivalently $UU^* = U^*U = I$
then U is said to be orthogonal or unitary according as the underlying field is real or
complex. The next theorem gives alternate characterizations of these operators.


Theorem 13.9: The following conditions on an operator U are equivalent:
(i) $U^* = U^{-1}$, that is, $UU^* = U^*U = I$.
(ii) U preserves inner products, i.e. for every $v, w \in V$,
$(U(v), U(w)) = (v, w)$
(iii) U preserves lengths, i.e. for every $v \in V$, $\|U(v)\| = \|v\|$.
Example 13.14: Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear operator which
rotates each vector about the z axis by a fixed
angle $\theta$:
$T(x,y,z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)$
Observe that lengths (distances from the origin)
are preserved under T. Thus T is an
orthogonal operator.


Example 13.15: Let V be the $l_2$-space of Example 13.5. Let $T: V \to V$ be the linear operator
defined by $T(a_1, a_2, \ldots) = (0, a_1, a_2, \ldots)$. Clearly, T preserves inner products and
lengths. However, T is not surjective since, for example, $(1, 0, 0, \ldots)$ does not belong
to the image of T; hence T is not invertible. Thus we see that Theorem 13.9 is not
valid for spaces of infinite dimension.

An isomorphism from one inner product space into another is a bijective mapping
which preserves the three basic operations of an inner product space: vector addition,
scalar multiplication, and inner products. Thus the above mappings (orthogonal and
unitary) may also be characterized as the isomorphisms of V into itself. Note that such a
mapping U also preserves distances, since
$\|U(v) - U(w)\| = \|U(v - w)\| = \|v - w\|$
and so U is also called an isometry.


ORTHOGONAL AND UNITARY MATRICES 


Let U be a linear operator on an inner product space V. By Theorem 13.6 we obtain the 
following result when the base field K is complex. 


Theorem 13.10A: A matrix A with complex entries represents a unitary operator U
(relative to an orthonormal basis) if and only if $A^* = A^{-1}$.


On the other hand, if the base field K is real then $A^* = A^t$; hence we have the following
corresponding theorem for real inner product spaces.


Theorem 13.10B: A matrix A with real entries represents an orthogonal operator U
(relative to an orthonormal basis) if and only if $A^t = A^{-1}$.


The above theorems motivate the following definitions.


Definition: A complex matrix A for which $A^* = A^{-1}$, or equivalently $AA^* = A^*A = I$,
is called a unitary matrix.


Definition: A real matrix A for which $A^t = A^{-1}$, or equivalently $AA^t = A^tA = I$, is
called an orthogonal matrix.


Observe that a unitary matrix with real entries is orthogonal. 


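Example 13.16: Suppose $A = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}$ is a unitary matrix. Then $AA^* = I$ and hence
$AA^* = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix} \begin{pmatrix} \bar{a}_1 & \bar{b}_1 \\ \bar{a}_2 & \bar{b}_2 \end{pmatrix} = \begin{pmatrix} |a_1|^2 + |a_2|^2 & a_1\bar{b}_1 + a_2\bar{b}_2 \\ \bar{a}_1b_1 + \bar{a}_2b_2 & |b_1|^2 + |b_2|^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$
Thus
$|a_1|^2 + |a_2|^2 = 1$, $|b_1|^2 + |b_2|^2 = 1$ and $a_1\bar{b}_1 + a_2\bar{b}_2 = 0$

Accordingly, the rows of A form an orthonormal set. Similarly, $A^*A = I$ forces
the columns of A to form an orthonormal set.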


The result in the above example holds true in general; namely, 


Theorem 13.11: The following conditions for a matrix A are equivalent:
(i) A is unitary (orthogonal).
(ii) The rows of A form an orthonormal set.
(iii) The columns of A form an orthonormal set.


Example 13.17: The matrix A representing the rotation T in Example 13.14 relative to the usual
basis of $\mathbf{R}^3$ is
$A = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$

As expected, the rows and the columns of A each form an orthonormal set; that is,
A is an orthogonal matrix.
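
The condition $AA^t = I$ for the rotation matrix of Example 13.17 can be verified numerically. The sketch below uses our own helper names (`rotation_z`, `transpose`, `matmul`, `is_orthogonal`) and a tolerance `tol` chosen arbitrarily to absorb floating point rounding.

```python
import math

def rotation_z(theta):
    """Matrix of the rotation about the z axis by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_orthogonal(A, tol=1e-12):
    """Check A A^t = I entry by entry, i.e. that the rows are orthonormal."""
    product = matmul(A, transpose(A))
    n = len(A)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

print(is_orthogonal(rotation_z(0.7)))             # True
print(is_orthogonal([[1.0, 1.0], [0.0, 1.0]]))    # False: rows are not orthonormal
```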


CHANGE OF ORTHONORMAL BASIS 


In view of the special role of orthonormal bases in the theory of inner product spaces, 
we are naturally interested in the properties of the transition matrix from one such basis 
to another. The following theorem applies. 


Theorem 13.12: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space V.
Then the transition matrix from $\{e_i\}$ into another orthonormal basis is
unitary (orthogonal). Conversely, if $P = (a_{ij})$ is a unitary (orthogonal)
matrix, then the following is an orthonormal basis:
$\{e_i' = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n : i = 1, \ldots, n\}$


Recall that matrices A and B representing the same linear operator T are similar, i.e.
$B = P^{-1}AP$ where P is the (nonsingular) transition matrix. On the other hand, if V is
an inner product space, we are usually interested in the case when P is unitary (or
orthogonal) as suggested by the above theorem. (Recall that P is unitary if $P^* = P^{-1}$, and P is
orthogonal if $P^t = P^{-1}$.) This leads to the following definition.


Definition: Complex matrices A and B are unitarily equivalent if there is a unitary matrix
P for which $B = P^*AP$. Analogously, real matrices A and B are orthogonally
equivalent if there is an orthogonal matrix P for which $B = P^tAP$.


Observe that orthogonally equivalent matrices are necessarily congruent (see page 262). 


POSITIVE OPERATORS 


Let P be a linear operator on an inner product space V. P is said to be positive (or:
semi-definite) if
$P = S^*S$ for some operator S
and is said to be positive definite if S is also nonsingular. The next theorems give alternate
characterizations of these operators.


Theorem 13.13A: The following conditions on an operator P are equivalent:
(i) $P = T^2$ for some self-adjoint operator T.
(ii) $P = S^*S$ for some operator S.
(iii) P is self-adjoint and $(P(u), u) \ge 0$ for every $u \in V$.


The corresponding theorem for positive definite operators is


Theorem 13.13B: The following conditions on an operator P are equivalent:
(i) $P = T^2$ for some nonsingular self-adjoint operator T.
(ii) $P = S^*S$ for some nonsingular operator S.
(iii) P is self-adjoint and $(P(u), u) > 0$ for every $u \ne 0$ in V.


DIAGONALIZATION AND CANONICAL FORMS IN EUCLIDEAN SPACES 


Let T be a linear operator on a finite dimensional inner product space V over K.
Representing T by a diagonal matrix depends upon the eigenvectors and eigenvalues of T,
and hence upon the roots of the characteristic polynomial $\Delta(t)$ of T (Theorem 9.6). Now
$\Delta(t)$ always factors into linear polynomials over the complex field C, but may not have any
linear factors over the real field R. Thus the situation for Euclidean spaces (where
K = R) is inherently different from that for unitary spaces (where K = C); hence we treat
them separately. We investigate Euclidean spaces below, and unitary spaces in the next
section.


Theorem 13.14: Let T be a symmetric (self-adjoint) operator on a real finite dimensional 
inner product space V. Then there exists an orthonormal basis of V 
consisting of eigenvectors of 7; that is, T can be represented by a 
diagonal matrix relative to an orthonormal basis. 


We give the corresponding statement for matrices. 


Alternate Form of Theorem 13.14: Let A be a real symmetric matrix. Then there exists 
an orthogonal matrix P such that B = P-1AP = PtAP 
is diagonal. 


We can choose the columns of the above matrix P to be normalized orthogonal eigen- 
vectors of A; then the diagonal entries of B are the corresponding eigenvalues. 


Example 13.18: Let $A = \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix}$. We find an orthogonal matrix P such that $P^tAP$ is diagonal.
The characteristic polynomial $\Delta(t)$ of A is
$\Delta(t) = |tI - A| = \begin{vmatrix} t-2 & 2 \\ 2 & t-5 \end{vmatrix} = t^2 - 7t + 6 = (t-6)(t-1)$
The eigenvalues of A are 6 and 1. Substitute t = 6 into the matrix tI − A to
obtain the corresponding homogeneous system of linear equations
$4x + 2y = 0, \quad 2x + y = 0$

A nonzero solution is $v_1 = (1, -2)$. Next substitute t = 1 into the matrix tI − A
to find the corresponding homogeneous system
$-x + 2y = 0, \quad 2x - 4y = 0$

A nonzero solution is $v_2 = (2, 1)$. As expected by Problem 13.31, $v_1$ and $v_2$ are orthogonal.
Normalize $v_1$ and $v_2$ to obtain the orthonormal basis
$\{u_1 = (1/\sqrt{5}, -2/\sqrt{5}),\ u_2 = (2/\sqrt{5}, 1/\sqrt{5})\}$
Finally let P be the matrix whose columns are $u_1$ and $u_2$ respectively. Then
$P = \begin{pmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{pmatrix}$ and $P^{-1}AP = P^tAP = \begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix}$

As expected, the diagonal entries of $P^tAP$ are the eigenvalues corresponding to the
columns of P.
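
The diagonalization in Example 13.18 can be confirmed numerically by forming $P^tAP$ directly. This is a sketch with our own helper names (`transpose`, `matmul`); the result equals the diagonal matrix with entries 6 and 1 only up to floating point rounding.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2.0, -2.0],
     [-2.0, 5.0]]

# Columns of P are the normalized eigenvectors u1 = (1, -2)/sqrt(5), u2 = (2, 1)/sqrt(5)
s = 1 / math.sqrt(5)
P = [[s, 2 * s],
     [-2 * s, s]]

B = matmul(transpose(P), matmul(A, P))
for row in B:
    print(row)  # approximately [6, 0] and [0, 1]
```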


We observe that the matrix $B = P^{-1}AP = P^tAP$ is also congruent to A. Now if q is
a real quadratic form represented by the matrix A, then the above method can be used
to diagonalize q under an orthogonal change of coordinates. This is illustrated in the
next example.


Example 13.19: Find an orthogonal transformation of coordinates which diagonalizes the quadratic
form $q(x,y) = 2x^2 - 4xy + 5y^2$.

The symmetric matrix representing q is $A = \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix}$. In the preceding
example we obtained the orthogonal matrix
$P = \begin{pmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{pmatrix}$ for which $P^tAP = \begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix}$

(Here 6 and 1 are the eigenvalues of A.) Thus the required orthogonal transformation
of coordinates is
$\begin{pmatrix} x \\ y \end{pmatrix} = P\begin{pmatrix} x' \\ y' \end{pmatrix}$, that is, $x = x'/\sqrt{5} + 2y'/\sqrt{5}$, $\ y = -2x'/\sqrt{5} + y'/\sqrt{5}$
Under this change of coordinates q is transformed into the diagonal form
$q(x', y') = 6x'^2 + y'^2$
Note that the diagonal entries of q are the eigenvalues of A.


An orthogonal operator T need not be symmetric, and so it may not be represented by 
a diagonal matrix relative to an orthonormal basis. However, such an operator T does have 
a simple canonical representation, as described in the next theorem. 


Theorem 13.15: Let T be an orthogonal operator on a real inner product space V. Then
there is an orthonormal basis with respect to which T has the following
form:

$\begin{pmatrix}
1 & & & & & & & & \\
 & \ddots & & & & & & & \\
 & & 1 & & & & & & \\
 & & & -1 & & & & & \\
 & & & & \ddots & & & & \\
 & & & & & -1 & & & \\
 & & & & & & \begin{matrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{matrix} & & \\
 & & & & & & & \ddots & \\
 & & & & & & & & \begin{matrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{matrix}
\end{pmatrix}$

The reader may recognize the above 2 by 2 diagonal blocks as representing rotations in
the corresponding two-dimensional subspaces.


DIAGONALIZATION AND CANONICAL FORMS IN UNITARY SPACES 


We now present the fundamental diagonalization theorem for complex inner product spaces, i.e. for unitary spaces. Recall that an operator T is said to be normal if it commutes with its adjoint, i.e. if $TT^* = T^*T$. Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose, i.e. if $AA^* = A^*A$.


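Example 13.20: Let $A = \begin{pmatrix} 1 & 1 \\ i & 3+2i \end{pmatrix}$. Then
$$AA^* = \begin{pmatrix} 1 & 1 \\ i & 3+2i \end{pmatrix}\begin{pmatrix} 1 & -i \\ 1 & 3-2i \end{pmatrix} = \begin{pmatrix} 2 & 3-3i \\ 3+3i & 14 \end{pmatrix}$$
$$A^*A = \begin{pmatrix} 1 & -i \\ 1 & 3-2i \end{pmatrix}\begin{pmatrix} 1 & 1 \\ i & 3+2i \end{pmatrix} = \begin{pmatrix} 2 & 3-3i \\ 3+3i & 14 \end{pmatrix}$$
Thus A is a normal matrix.


The following theorem applies.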


Theorem 13.16: Let T be a normal operator on a complex finite dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.


We give the corresponding statement for matrices. 


Alternate Form of Theorem 13.16: Let A be a normal matrix. Then there exists a unitary matrix P such that $B = P^{-1}AP = P^*AP$ is diagonal.


The next theorem shows that even non-normal operators on unitary spaces have a 
relatively simple form. 


Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional inner 
product space V. Then T can be represented by a triangular matrix 
relative to an orthonormal basis of V. 




Alternate Form of Theorem 13.17: Let A be an arbitrary complex matrix. Then there exists a unitary matrix P such that $B = P^{-1}AP = P^*AP$ is triangular.


SPECTRAL THEOREM 


The Spectral Theorem is a reformulation of the diagonalization Theorems 13.14 and 13.16. 


Theorem 13.18 (Spectral Theorem): Let T be a normal (symmetric) operator on a complex (real) finite dimensional inner product space V. Then there exist orthogonal projections $E_1, \dots, E_r$ on V and scalars $\lambda_1, \dots, \lambda_r$ such that

(i) $T = \lambda_1E_1 + \lambda_2E_2 + \cdots + \lambda_rE_r$
(ii) $E_1 + E_2 + \cdots + E_r = I$
(iii) $E_iE_j = 0$ for $i \ne j$.


The next example shows the relationship between a diagonal matrix representation and 
the corresponding orthogonal projections. 


Example 13.21: Consider a diagonal matrix, say A = : Let 


et ; 2 
The reader can verify that the $E_i$ are projections, i.e. $E_i^2 = E_i$, and that

(i) $A = 2E_1 + 3E_2 + 5E_3$, (ii) $E_1 + E_2 + E_3 = I$, (iii) $E_iE_j = 0$ for $i \ne j$


Solved Problems 


INNER PRODUCTS 
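13.1. Verify the relation $\langle u, av_1 + bv_2\rangle = \bar a\langle u, v_1\rangle + \bar b\langle u, v_2\rangle$.
Using $[I_2]$, $[I_1]$ and then $[I_2]$, we find
$$\langle u, av_1 + bv_2\rangle = \overline{\langle av_1 + bv_2, u\rangle} = \overline{a\langle v_1, u\rangle + b\langle v_2, u\rangle} = \bar a\,\overline{\langle v_1, u\rangle} + \bar b\,\overline{\langle v_2, u\rangle} = \bar a\langle u, v_1\rangle + \bar b\langle u, v_2\rangle$$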


13.2. Verify that the following is an inner product in $R^2$:
$$\langle u, v\rangle = x_1y_1 - x_1y_2 - x_2y_1 + 3x_2y_2, \quad\text{where } u = (x_1, x_2),\ v = (y_1, y_2).$$

Method 1.
We verify the three axioms of an inner product. Letting $w = (z_1, z_2)$, we find
$$au + bw = a(x_1, x_2) + b(z_1, z_2) = (ax_1 + bz_1,\ ax_2 + bz_2)$$
Thus
$$\begin{aligned} \langle au + bw, v\rangle &= \langle (ax_1 + bz_1,\ ax_2 + bz_2),\ (y_1, y_2)\rangle \\ &= (ax_1 + bz_1)y_1 - (ax_1 + bz_1)y_2 - (ax_2 + bz_2)y_1 + 3(ax_2 + bz_2)y_2 \\ &= a(x_1y_1 - x_1y_2 - x_2y_1 + 3x_2y_2) + b(z_1y_1 - z_1y_2 - z_2y_1 + 3z_2y_2) \\ &= a\langle u, v\rangle + b\langle w, v\rangle \end{aligned}$$
and so axiom $[I_1]$ is satisfied. Also,
$$\langle v, u\rangle = y_1x_1 - y_1x_2 - y_2x_1 + 3y_2x_2 = x_1y_1 - x_1y_2 - x_2y_1 + 3x_2y_2 = \langle u, v\rangle$$
and axiom $[I_2]$ is satisfied. Finally,
$$\langle u, u\rangle = x_1^2 - 2x_1x_2 + 3x_2^2 = x_1^2 - 2x_1x_2 + x_2^2 + 2x_2^2 = (x_1 - x_2)^2 + 2x_2^2 \ge 0$$
Also, $\langle u, u\rangle = 0$ if and only if $x_1 = 0$, $x_2 = 0$, i.e. $u = 0$. Hence the last axiom $[I_3]$ is satisfied.


Method 2. 


We argue via matrices. That is, we can write $\langle u, v\rangle$ in matrix notation:
$$\langle u, v\rangle = u^tAv = (x_1, x_2)\begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$$
and so $[I_1]$ holds. Since A is symmetric, $[I_2]$ holds. Thus we need only show that A is positive definite. Applying the elementary row operation $R_2 \to R_1 + R_2$ and then the corresponding elementary column operation $C_2 \to C_1 + C_2$, we transform A into the diagonal form $\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$. Thus A is positive definite and $[I_3]$ holds.


13.3. Find the norm of $v = (3, 4) \in R^2$ with respect to:
(i) the usual inner product, (ii) the inner product in Problem 13.2.

(i) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 + 16 = 25$; hence $\|v\| = 5$.

(ii) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 - 12 - 12 + 48 = 33$; hence $\|v\| = \sqrt{33}$.
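The two norms of Problem 13.3 can be reproduced with a few lines of Python. This is only a sketch: `ip` encodes the inner product of Problem 13.2, `dot` the usual one, and both names (as well as `norm`) are chosen here for illustration.

```python
import math

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def ip(u, v):
    """Inner product of Problem 13.2 on R^2."""
    x1, x2 = u
    y1, y2 = v
    return x1 * y1 - x1 * y2 - x2 * y1 + 3 * x2 * y2

def norm(v, inner):
    """Norm induced by the inner product passed as `inner`."""
    return math.sqrt(inner(v, v))

v = (3, 4)
usual_norm = norm(v, dot)     # square root of 9 + 16
new_norm_squared = ip(v, v)   # 9 - 12 - 12 + 48
new_norm = norm(v, ip)
```

The same vector has different lengths under the two inner products, which is the point of the problem.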


13.4. Normalize each of the following vectors in Euclidean space $R^3$:
(i) $u = (2, 1, -1)$, (ii) $v = (1/2,\ 2/3,\ -1/4)$.

(i) Note $\langle u, u\rangle$ is the sum of the squares of the entries of u; that is, $\langle u, u\rangle = 2^2 + 1^2 + (-1)^2 = 6$. Hence divide u by $\|u\| = \sqrt{\langle u, u\rangle} = \sqrt6$ to obtain the required unit vector:
$$u/\|u\| = (2/\sqrt6,\ 1/\sqrt6,\ -1/\sqrt6)$$

(ii) First multiply v by 12 to "clear" of fractions: $12v = (6, 8, -3)$. We have $\langle 12v, 12v\rangle = 6^2 + 8^2 + (-3)^2 = 109$. Then the required unit vector is
$$12v/\|12v\| = (6/\sqrt{109},\ 8/\sqrt{109},\ -3/\sqrt{109})$$


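13.5. Let V be the vector space of polynomials with inner product given by $\langle f, g\rangle = \int_0^1 f(t)\,g(t)\,dt$. Let $f(t) = t + 2$ and $g(t) = t^2 - 2t - 3$. Find (i) $\langle f, g\rangle$ and (ii) $\|f\|$.

(i) $\langle f, g\rangle = \int_0^1 (t + 2)(t^2 - 2t - 3)\,dt = \left[\dfrac{t^4}{4} - \dfrac{7t^2}{2} - 6t\right]_0^1 = -\dfrac{37}{4}$

(ii) $\langle f, f\rangle = \int_0^1 (t + 2)(t + 2)\,dt = 19/3$ and $\|f\| = \sqrt{\langle f, f\rangle} = \sqrt{19/3}$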
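The integrals in Problem 13.5 can be evaluated exactly with rational arithmetic. The sketch below represents a polynomial by its list of coefficients in increasing degree; that representation and the helper names `poly_mul`, `integrate_0_1` and `ip` are assumptions made for this illustration.

```python
from fractions import Fraction

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def integrate_0_1(p):
    """Integral over [0, 1]: the term c*t^k contributes c/(k+1)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def ip(f, g):
    """Inner product <f, g> = integral from 0 to 1 of f(t) g(t) dt."""
    return integrate_0_1(poly_mul(f, g))

f = [2, 1]         # f(t) = t + 2
g = [-3, -2, 1]    # g(t) = t^2 - 2t - 3

fg = ip(f, g)      # <f, g>
ff = ip(f, f)      # <f, f>, the square of the norm of f
```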
0 




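13.6. Prove Theorem 13.1 (Cauchy-Schwarz): $|\langle u, v\rangle| \le \|u\|\,\|v\|$.

If $v = 0$, the inequality reduces to $0 \le 0$ and hence is valid. Now suppose $v \ne 0$. Using $z\bar z = |z|^2$ (for any complex number z) and $\langle v, u\rangle = \overline{\langle u, v\rangle}$, we expand $\|u - \langle u, v\rangle tv\|^2 \ge 0$ where t is any real value:
$$\begin{aligned} 0 \le \|u - \langle u, v\rangle tv\|^2 &= \langle u - \langle u, v\rangle tv,\ u - \langle u, v\rangle tv\rangle \\ &= \langle u, u\rangle - \overline{\langle u, v\rangle}\,t\,\langle u, v\rangle - \langle u, v\rangle\,t\,\langle v, u\rangle + \langle u, v\rangle\overline{\langle u, v\rangle}\,t^2\langle v, v\rangle \\ &= \|u\|^2 - 2t\,|\langle u, v\rangle|^2 + |\langle u, v\rangle|^2\,t^2\,\|v\|^2 \end{aligned}$$
Set $t = 1/\|v\|^2$ to find $0 \le \|u\|^2 - \dfrac{|\langle u, v\rangle|^2}{\|v\|^2}$, from which $|\langle u, v\rangle|^2 \le \|u\|^2\,\|v\|^2$. Taking the square root of both sides, we obtain the required inequality.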


13.7. Prove that the norm in an inner product space satisfies the following axioms:
$[N_1]$: $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
$[N_2]$: $\|kv\| = |k|\,\|v\|$.
$[N_3]$: $\|u + v\| \le \|u\| + \|v\|$.

By $[I_3]$, $\langle v, v\rangle \ge 0$; hence $\|v\| = \sqrt{\langle v, v\rangle} \ge 0$. Furthermore, $\|v\| = 0$ if and only if $\langle v, v\rangle = 0$, and this holds if and only if $v = 0$. Thus $[N_1]$ is valid.

We find $\|kv\|^2 = \langle kv, kv\rangle = k\bar k\langle v, v\rangle = |k|^2\,\|v\|^2$. Taking the square root of both sides gives $[N_2]$.

Using the Cauchy-Schwarz inequality, we obtain
$$\begin{aligned} \|u + v\|^2 = \langle u + v, u + v\rangle &= \langle u, u\rangle + \langle u, v\rangle + \overline{\langle u, v\rangle} + \langle v, v\rangle \\ &\le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2 \end{aligned}$$
Taking the square root of both sides yields $[N_3]$.

Remark: $[N_3]$ is frequently called the triangle inequality because if we view $u + v$ as the side of the triangle formed with u and v (as illustrated on the right), then $[N_3]$ states that the length of one side of a triangle is less than or equal to the sum of the lengths of the other two sides.

ORTHOGONALITY 


13.8. Show that if u is orthogonal to v, then every scalar multiple of u is also orthogonal to v. Find a unit vector orthogonal to $v_1 = (1, 1, 2)$ and $v_2 = (0, 1, 3)$ in $R^3$.

If $\langle u, v\rangle = 0$ then $\langle ku, v\rangle = k\langle u, v\rangle = k \cdot 0 = 0$, as required. Let $w = (x, y, z)$. We want
$$0 = \langle w, v_1\rangle = x + y + 2z \quad\text{and}\quad 0 = \langle w, v_2\rangle = y + 3z$$
Thus we obtain the homogeneous system
$$x + y + 2z = 0, \qquad y + 3z = 0$$
Set $z = 1$ to find $y = -3$ and $x = 1$; then $w = (1, -3, 1)$. Normalize w to obtain the required unit vector $w'$ orthogonal to $v_1$ and $v_2$: $w' = w/\|w\| = (1/\sqrt{11},\ -3/\sqrt{11},\ 1/\sqrt{11})$.


13.9. Let W be the subspace of $R^5$ spanned by $u = (1, 2, 3, -1, 2)$ and $v = (2, 4, 7, 2, -1)$. Find a basis of the orthogonal complement $W^\perp$ of W.

We seek all vectors $w = (x, y, z, s, t)$ such that
$$\begin{aligned} \langle w, u\rangle &= x + 2y + 3z - s + 2t = 0 \\ \langle w, v\rangle &= 2x + 4y + 7z + 2s - t = 0 \end{aligned}$$
Eliminating x from the second equation, we find the equivalent system
$$\begin{aligned} x + 2y + 3z - s + 2t &= 0 \\ z + 4s - 5t &= 0 \end{aligned}$$
The free variables are y, s and t. Set $y = -1$, $s = 0$, $t = 0$ to obtain the solution $w_1 = (2, -1, 0, 0, 0)$. Set $y = 0$, $s = 1$, $t = 0$ to find the solution $w_2 = (13, 0, -4, 1, 0)$. Set $y = 0$, $s = 0$, $t = 1$ to obtain the solution $w_3 = (-17, 0, 5, 0, 1)$. The set $\{w_1, w_2, w_3\}$ is a basis of $W^\perp$.
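A quick check of Problem 13.9: each basis vector of the orthogonal complement must have zero inner product with both spanning vectors of W. The sketch below uses the values $w_1 = (2, -1, 0, 0, 0)$, $w_2 = (13, 0, -4, 1, 0)$ and $w_3 = (-17, 0, 5, 0, 1)$ obtained from the reduced system; the names `dot`, `basis` and `products` are illustrative.

```python
def dot(a, b):
    """Usual inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

u = (1, 2, 3, -1, 2)
v = (2, 4, 7, 2, -1)

# Solutions of the homogeneous system for the three choices of (y, s, t).
basis = [
    (2, -1, 0, 0, 0),
    (13, 0, -4, 1, 0),
    (-17, 0, 5, 0, 1),
]

# Each pair should be (0, 0) if the vector lies in the orthogonal complement.
products = [(dot(w, u), dot(w, v)) for w in basis]

# The (y, s, t) coordinates form a diagonal pattern with nonzero entries,
# so the three vectors are linearly independent.
free_coordinates = [(w[1], w[3], w[4]) for w in basis]
```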


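13.10. Find an orthonormal basis of the subspace W of $C^3$ spanned by $v_1 = (1, i, 0)$ and $v_2 = (1, 2, 1 - i)$.

Apply the Gram-Schmidt orthogonalization process. First normalize $v_1$. We find
$$\|v_1\|^2 = \langle v_1, v_1\rangle = 1 \cdot 1 + i \cdot (-i) + 0 \cdot 0 = 2 \quad\text{and so}\quad \|v_1\| = \sqrt2$$
Thus $u_1 = v_1/\|v_1\| = (1/\sqrt2,\ i/\sqrt2,\ 0)$.

To form $w_2 = v_2 - \langle v_2, u_1\rangle u_1$, first compute
$$\langle v_2, u_1\rangle = \langle (1, 2, 1 - i),\ (1/\sqrt2,\ i/\sqrt2,\ 0)\rangle = 1/\sqrt2 - 2i/\sqrt2 = (1 - 2i)/\sqrt2$$
Then
$$w_2 = (1, 2, 1 - i) - \frac{1 - 2i}{\sqrt2}\left(\frac{1}{\sqrt2},\ \frac{i}{\sqrt2},\ 0\right) = \left(\frac{1 + 2i}{2},\ \frac{2 - i}{2},\ 1 - i\right)$$
Next normalize $w_2$ or, equivalently, $2w_2 = (1 + 2i,\ 2 - i,\ 2 - 2i)$. We have
$$\|2w_2\|^2 = \langle 2w_2, 2w_2\rangle = (1 + 2i)(1 - 2i) + (2 - i)(2 + i) + (2 - 2i)(2 + 2i) = 18$$
and $\|2w_2\| = \sqrt{18}$. Thus the required orthonormal basis of W is
$$\left\{ u_1 = \left(\frac{1}{\sqrt2},\ \frac{i}{\sqrt2},\ 0\right),\quad u_2 = \left(\frac{1 + 2i}{\sqrt{18}},\ \frac{2 - i}{\sqrt{18}},\ \frac{2 - 2i}{\sqrt{18}}\right) \right\}$$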
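The Gram-Schmidt steps above can be sketched in Python using built-in complex numbers. The function below is a plain classical Gram-Schmidt, written for illustration under the convention that the inner product is linear in the first argument and conjugate-linear in the second; the names `ip` and `gram_schmidt` are assumptions of this sketch.

```python
import math

def ip(u, v):
    """Usual inner product on C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same space as `vectors`.

    Assumes the input vectors are linearly independent.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = ip(v, u)                       # component of v along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        length = math.sqrt(ip(w, w).real)      # <w, w> is real and positive
        basis.append([wi / length for wi in w])
    return basis

v1 = [1 + 0j, 1j, 0j]
v2 = [1 + 0j, 2 + 0j, 1 - 1j]
u1, u2 = gram_schmidt([v1, v2])
```

Up to rounding, `u2` should agree with $(1 + 2i,\ 2 - i,\ 2 - 2i)/\sqrt{18}$, the second basis vector found by normalizing $2w_2$.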


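13.11. Prove Lemma 13.3: An orthonormal set $\{u_1, \dots, u_r\}$ is linearly independent and, for any $v \in V$, the vector
$$w = v - \langle v, u_1\rangle u_1 - \langle v, u_2\rangle u_2 - \cdots - \langle v, u_r\rangle u_r$$
is orthogonal to each of the $u_i$.

Suppose $a_1u_1 + \cdots + a_ru_r = 0$. Taking the inner product of both sides with respect to $u_1$,
$$\begin{aligned} 0 = \langle 0, u_1\rangle &= \langle a_1u_1 + \cdots + a_ru_r,\ u_1\rangle \\ &= a_1\langle u_1, u_1\rangle + a_2\langle u_2, u_1\rangle + \cdots + a_r\langle u_r, u_1\rangle \\ &= a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_r \cdot 0 = a_1 \end{aligned}$$
or $a_1 = 0$. Similarly, for $i = 2, \dots, r$,
$$\begin{aligned} 0 = \langle 0, u_i\rangle &= \langle a_1u_1 + \cdots + a_ru_r,\ u_i\rangle \\ &= a_1\langle u_1, u_i\rangle + \cdots + a_i\langle u_i, u_i\rangle + \cdots + a_r\langle u_r, u_i\rangle = a_i \end{aligned}$$
Accordingly, $\{u_1, \dots, u_r\}$ is linearly independent.

It remains to show that w is orthogonal to each of the $u_i$. Taking the inner product of w with respect to $u_1$,
$$\begin{aligned} \langle w, u_1\rangle &= \langle v, u_1\rangle - \langle v, u_1\rangle\langle u_1, u_1\rangle - \langle v, u_2\rangle\langle u_2, u_1\rangle - \cdots - \langle v, u_r\rangle\langle u_r, u_1\rangle \\ &= \langle v, u_1\rangle - \langle v, u_1\rangle \cdot 1 - \langle v, u_2\rangle \cdot 0 - \cdots - \langle v, u_r\rangle \cdot 0 = 0 \end{aligned}$$
That is, w is orthogonal to $u_1$. Similarly, for $i = 2, \dots, r$,
$$\langle w, u_i\rangle = \langle v, u_i\rangle - \langle v, u_1\rangle\langle u_1, u_i\rangle - \cdots - \langle v, u_i\rangle\langle u_i, u_i\rangle - \cdots - \langle v, u_r\rangle\langle u_r, u_i\rangle = 0$$
Thus w is orthogonal to $u_i$ for $i = 1, \dots, r$, as claimed.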


13.12. Let W be a subspace of an inner product space V. Show that there is an orthonormal basis of W which is part of an orthonormal basis of V.

We choose a basis $\{v_1, \dots, v_r\}$ of W and extend it to a basis $\{v_1, \dots, v_n\}$ of V. We then apply the Gram-Schmidt orthogonalization process to $\{v_1, \dots, v_n\}$ to obtain an orthonormal basis $\{u_1, \dots, u_n\}$ of V where, for $i = 1, \dots, n$, $u_i = a_{i1}v_1 + \cdots + a_{ii}v_i$. Thus $u_1, \dots, u_r \in W$ and therefore $\{u_1, \dots, u_r\}$ is an orthonormal basis of W.




13.13. Prove Theorem 13.2: Let W be a subspace of V; then $V = W \oplus W^\perp$.

By Problem 13.12 there exists an orthonormal basis $\{u_1, \dots, u_r\}$ of W which is part of an orthonormal basis $\{u_1, \dots, u_n\}$ of V. Since $\{u_1, \dots, u_n\}$ is orthonormal, $u_{r+1}, \dots, u_n \in W^\perp$. If $v \in V$,
$$v = a_1u_1 + \cdots + a_nu_n \quad\text{where}\quad a_1u_1 + \cdots + a_ru_r \in W,\quad a_{r+1}u_{r+1} + \cdots + a_nu_n \in W^\perp$$
Accordingly, $V = W + W^\perp$.

On the other hand, if $w \in W \cap W^\perp$, then $\langle w, w\rangle = 0$. This yields $w = 0$; hence $W \cap W^\perp = \{0\}$.

The two conditions, $V = W + W^\perp$ and $W \cap W^\perp = \{0\}$, give the desired result $V = W \oplus W^\perp$.

Note that we have proved the theorem only for the case that V has finite dimension; we remark that the theorem also holds for spaces of arbitrary dimension.


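13.14. Let W be a subspace of V. Show that $W \subset W^{\perp\perp}$, and that $W = W^{\perp\perp}$ when V has finite dimension.

Let $w \in W$. Then $\langle w, v\rangle = 0$ for every $v \in W^\perp$; hence $w \in W^{\perp\perp}$. Accordingly, $W \subset W^{\perp\perp}$.

Now suppose V has finite dimension. By Theorem 13.2, $V = W \oplus W^\perp$ and, also, $V = W^\perp \oplus W^{\perp\perp}$. Hence
$$\dim W = \dim V - \dim W^\perp \quad\text{and}\quad \dim W^{\perp\perp} = \dim V - \dim W^\perp$$
This yields $\dim W = \dim W^{\perp\perp}$. But $W \subset W^{\perp\perp}$ by the above; hence $W = W^{\perp\perp}$, as required.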


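13.15. Let $\{e_1, \dots, e_n\}$ be an orthonormal basis of V. Prove:
(i) for any $u \in V$, $u = \langle u, e_1\rangle e_1 + \langle u, e_2\rangle e_2 + \cdots + \langle u, e_n\rangle e_n$;
(ii) $\langle a_1e_1 + \cdots + a_ne_n,\ b_1e_1 + \cdots + b_ne_n\rangle = a_1\bar b_1 + a_2\bar b_2 + \cdots + a_n\bar b_n$;
(iii) for any $u, v \in V$, $\langle u, v\rangle = \langle u, e_1\rangle\overline{\langle v, e_1\rangle} + \cdots + \langle u, e_n\rangle\overline{\langle v, e_n\rangle}$;
(iv) if $T: V \to V$ is linear, then $\langle T(e_j), e_i\rangle$ is the ij-entry of the matrix A representing T in the given basis $\{e_i\}$.

(i) Suppose $u = k_1e_1 + k_2e_2 + \cdots + k_ne_n$. Taking the inner product of u with $e_1$,
$$\begin{aligned} \langle u, e_1\rangle &= \langle k_1e_1 + k_2e_2 + \cdots + k_ne_n,\ e_1\rangle \\ &= k_1\langle e_1, e_1\rangle + k_2\langle e_2, e_1\rangle + \cdots + k_n\langle e_n, e_1\rangle \\ &= k_1 \cdot 1 + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1 \end{aligned}$$
Similarly, for $i = 2, \dots, n$,
$$\begin{aligned} \langle u, e_i\rangle &= \langle k_1e_1 + \cdots + k_ie_i + \cdots + k_ne_n,\ e_i\rangle \\ &= k_1\langle e_1, e_i\rangle + \cdots + k_i\langle e_i, e_i\rangle + \cdots + k_n\langle e_n, e_i\rangle \\ &= k_1 \cdot 0 + \cdots + k_i \cdot 1 + \cdots + k_n \cdot 0 = k_i \end{aligned}$$
Substituting $\langle u, e_i\rangle$ for $k_i$ in the equation $u = k_1e_1 + \cdots + k_ne_n$, we obtain the desired result.

(ii) We have
$$\left\langle \sum_{i=1}^n a_ie_i,\ \sum_{j=1}^n b_je_j \right\rangle = \sum_{i,j=1}^n a_i\bar b_j\langle e_i, e_j\rangle$$
But $\langle e_i, e_j\rangle = 0$ for $i \ne j$, and $\langle e_i, e_j\rangle = 1$ for $i = j$; hence, as required,
$$\left\langle \sum_{i=1}^n a_ie_i,\ \sum_{j=1}^n b_je_j \right\rangle = \sum_{i=1}^n a_i\bar b_i = a_1\bar b_1 + a_2\bar b_2 + \cdots + a_n\bar b_n$$

(iii) By (i), $u = \langle u, e_1\rangle e_1 + \cdots + \langle u, e_n\rangle e_n$ and $v = \langle v, e_1\rangle e_1 + \cdots + \langle v, e_n\rangle e_n$. Then by (ii),
$$\langle u, v\rangle = \langle u, e_1\rangle\overline{\langle v, e_1\rangle} + \langle u, e_2\rangle\overline{\langle v, e_2\rangle} + \cdots + \langle u, e_n\rangle\overline{\langle v, e_n\rangle}$$

(iv) By (i),
$$\begin{aligned} T(e_1) &= \langle T(e_1), e_1\rangle e_1 + \langle T(e_1), e_2\rangle e_2 + \cdots + \langle T(e_1), e_n\rangle e_n \\ T(e_2) &= \langle T(e_2), e_1\rangle e_1 + \langle T(e_2), e_2\rangle e_2 + \cdots + \langle T(e_2), e_n\rangle e_n \\ &\ \ \vdots \\ T(e_n) &= \langle T(e_n), e_1\rangle e_1 + \langle T(e_n), e_2\rangle e_2 + \cdots + \langle T(e_n), e_n\rangle e_n \end{aligned}$$
The matrix A representing T in the basis $\{e_i\}$ is the transpose of the above matrix of coefficients; hence the ij-entry of A is $\langle T(e_j), e_i\rangle$.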
ADJOINTS 


13.16. Let T be the linear operator on $C^3$ defined by
$$T(x, y, z) = (2x + (1 - i)y,\ (3 + 2i)x - 4iz,\ 2ix + (4 - 3i)y - 3z)$$
Find $T^*(x, y, z)$.

First find the matrix A representing T in the usual basis of $C^3$ (see Problem 7.3):
$$A = \begin{pmatrix} 2 & 1 - i & 0 \\ 3 + 2i & 0 & -4i \\ 2i & 4 - 3i & -3 \end{pmatrix}$$
Form the conjugate transpose $A^*$ of A:
$$A^* = \begin{pmatrix} 2 & 3 - 2i & -2i \\ 1 + i & 0 & 4 + 3i \\ 0 & 4i & -3 \end{pmatrix}$$
Thus
$$T^*(x, y, z) = (2x + (3 - 2i)y - 2iz,\ (1 + i)x + (4 + 3i)z,\ 4iy - 3z)$$
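The defining property of the adjoint, $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$, gives a direct way to test the matrix found in Problem 13.16. In the sketch below the test vectors `u` and `v` are arbitrary choices and the helper names `conj_transpose`, `apply` and `ip` are illustrative.

```python
def conj_transpose(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[entry.conjugate() for entry in col] for col in zip(*m)]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def ip(u, v):
    """Usual inner product on C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

# Matrix of T in the usual basis of C^3.
A = [
    [2, 1 - 1j, 0],
    [3 + 2j, 0, -4j],
    [2j, 4 - 3j, -3],
]
A_star = conj_transpose(A)

# Arbitrary test vectors: <A u, v> should equal <u, A* v>.
u = [1 + 2j, -1j, 3]
v = [2 - 1j, 4, 1 + 1j]
lhs = ip(apply(A, u), v)
rhs = ip(u, apply(A_star, v))
```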


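13.17. Prove Theorem 13.5: Let $\phi$ be a linear functional on a finite dimensional inner product space V. Then there exists a unique $u \in V$ such that $\phi(v) = \langle v, u\rangle$ for every $v \in V$.

Let $\{e_1, \dots, e_n\}$ be an orthonormal basis of V. Set
$$u = \overline{\phi(e_1)}\,e_1 + \overline{\phi(e_2)}\,e_2 + \cdots + \overline{\phi(e_n)}\,e_n$$
Let $\hat u$ be the linear functional on V defined by $\hat u(v) = \langle v, u\rangle$, for every $v \in V$. Then for $i = 1, \dots, n$,
$$\hat u(e_i) = \langle e_i, u\rangle = \langle e_i,\ \overline{\phi(e_1)}\,e_1 + \cdots + \overline{\phi(e_n)}\,e_n\rangle = \phi(e_i)$$
Since $\hat u$ and $\phi$ agree on each basis vector, $\hat u = \phi$.

Now suppose $u'$ is another vector in V for which $\phi(v) = \langle v, u'\rangle$ for every $v \in V$. Then $\langle v, u\rangle = \langle v, u'\rangle$ or $\langle v, u - u'\rangle = 0$. In particular this is true for $v = u - u'$ and so $\langle u - u', u - u'\rangle = 0$. This yields $u - u' = 0$ and $u = u'$. Thus such a vector u is unique as claimed.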


13.18. Prove Theorem 13.6: Let T be a linear operator on a finite dimensional inner product space V. Then there exists a unique linear operator $T^*$ on V such that $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$, for every $u, v \in V$. Moreover, if A is the matrix representing T in an orthonormal basis $\{e_i\}$ of V, then the conjugate transpose $A^*$ of A is the matrix representing $T^*$ in $\{e_i\}$.

We first define the mapping $T^*$. Let v be an arbitrary but fixed element of V. The map $u \mapsto \langle T(u), v\rangle$ is a linear functional on V. Hence by Theorem 13.5 there exists a unique element $v' \in V$ such that $\langle T(u), v\rangle = \langle u, v'\rangle$ for every $u \in V$. We define $T^*: V \to V$ by $T^*(v) = v'$. Then $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for every $u, v \in V$.

We next show that $T^*$ is linear. For any $u, v_i \in V$, and any $a, b \in K$,
$$\begin{aligned} \langle u, T^*(av_1 + bv_2)\rangle &= \langle T(u), av_1 + bv_2\rangle = \bar a\langle T(u), v_1\rangle + \bar b\langle T(u), v_2\rangle \\ &= \bar a\langle u, T^*(v_1)\rangle + \bar b\langle u, T^*(v_2)\rangle = \langle u,\ aT^*(v_1) + bT^*(v_2)\rangle \end{aligned}$$
But this is true for every $u \in V$; hence $T^*(av_1 + bv_2) = aT^*(v_1) + bT^*(v_2)$. Thus $T^*$ is linear.

By Problem 13.15(iv), the matrices $A = (a_{ij})$ and $B = (b_{ij})$ representing T and $T^*$ respectively in the basis $\{e_i\}$ are given by $a_{ij} = \langle T(e_j), e_i\rangle$ and $b_{ij} = \langle T^*(e_j), e_i\rangle$. Hence
$$b_{ij} = \langle T^*(e_j), e_i\rangle = \overline{\langle e_i, T^*(e_j)\rangle} = \overline{\langle T(e_i), e_j\rangle} = \overline{a_{ji}}$$
Thus $B = A^*$, as claimed.


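13.19. Prove Theorem 13.7: Let S and T be linear operators on a finite dimensional inner product space V and let $k \in K$. Then:

(i) $(S + T)^* = S^* + T^*$ (iii) $(ST)^* = T^*S^*$
(ii) $(kT)^* = \bar kT^*$ (iv) $(T^*)^* = T$

(i) For any $u, v \in V$,
$$\begin{aligned} \langle (S + T)(u), v\rangle &= \langle S(u) + T(u), v\rangle = \langle S(u), v\rangle + \langle T(u), v\rangle = \langle u, S^*(v)\rangle + \langle u, T^*(v)\rangle \\ &= \langle u, S^*(v) + T^*(v)\rangle = \langle u, (S^* + T^*)(v)\rangle \end{aligned}$$
The uniqueness of the adjoint implies $(S + T)^* = S^* + T^*$.
(ii) For any $u, v \in V$,
$$\langle (kT)(u), v\rangle = \langle kT(u), v\rangle = k\langle T(u), v\rangle = k\langle u, T^*(v)\rangle = \langle u, \bar kT^*(v)\rangle = \langle u, (\bar kT^*)(v)\rangle$$
The uniqueness of the adjoint implies $(kT)^* = \bar kT^*$.
(iii) For any $u, v \in V$,
$$\langle (ST)(u), v\rangle = \langle S(T(u)), v\rangle = \langle T(u), S^*(v)\rangle = \langle u, T^*(S^*(v))\rangle = \langle u, (T^*S^*)(v)\rangle$$
The uniqueness of the adjoint implies $(ST)^* = T^*S^*$.
(iv) For any $u, v \in V$,
$$\langle T^*(u), v\rangle = \overline{\langle v, T^*(u)\rangle} = \overline{\langle T(v), u\rangle} = \langle u, T(v)\rangle$$
The uniqueness of the adjoint implies $(T^*)^* = T$.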


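13.20. Show that: (i) $I^* = I$; (ii) $0^* = 0$; (iii) if T is invertible, then $(T^{-1})^* = (T^*)^{-1}$.

(i) For every $u, v \in V$, $\langle I(u), v\rangle = \langle u, v\rangle = \langle u, I(v)\rangle$; hence $I^* = I$.

(ii) For every $u, v \in V$, $\langle 0(u), v\rangle = \langle 0, v\rangle = 0 = \langle u, 0\rangle = \langle u, 0(v)\rangle$; hence $0^* = 0$.

(iii) $I = I^* = (TT^{-1})^* = (T^{-1})^*T^*$; hence $(T^{-1})^* = (T^*)^{-1}$.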


13.21. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that $W^\perp$ is invariant under $T^*$.

Let $u \in W^\perp$. If $w \in W$, then $T(w) \in W$ and so $\langle w, T^*(u)\rangle = \langle T(w), u\rangle = 0$. Thus $T^*(u) \in W^\perp$ since it is orthogonal to every $w \in W$. Hence $W^\perp$ is invariant under $T^*$.


13.22. Let T be a linear operator on V. Show that each of the following conditions implies $T = 0$:

(i) $\langle T(u), v\rangle = 0$ for every $u, v \in V$;
(ii) V is a complex space, and $\langle T(u), u\rangle = 0$ for every $u \in V$;
(iii) T is self-adjoint and $\langle T(u), u\rangle = 0$ for every $u \in V$.

Give an example of an operator T on a real space V for which $\langle T(u), u\rangle = 0$ for every $u \in V$ but $T \ne 0$.

(i) Set $v = T(u)$. Then $\langle T(u), T(u)\rangle = 0$ and hence $T(u) = 0$, for every $u \in V$. Accordingly, $T = 0$.

(ii) By hypothesis, $\langle T(v + w), v + w\rangle = 0$ for any $v, w \in V$. Expanding and setting $\langle T(v), v\rangle = 0$ and $\langle T(w), w\rangle = 0$,
$$\langle T(v), w\rangle + \langle T(w), v\rangle = 0 \qquad (1)$$
Note w is arbitrary in (1). Substituting $iw$ for w, and using $\langle T(v), iw\rangle = \bar i\langle T(v), w\rangle = -i\langle T(v), w\rangle$ and $\langle T(iw), v\rangle = \langle iT(w), v\rangle = i\langle T(w), v\rangle$,
$$-i\langle T(v), w\rangle + i\langle T(w), v\rangle = 0$$
Dividing through by i and adding to (1), we obtain $\langle T(w), v\rangle = 0$ for any $v, w \in V$. By (i), $T = 0$.

(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding $\langle T(v + w), v + w\rangle = 0$, we again obtain (1). Since T is self-adjoint and since it is a real space, we have $\langle T(w), v\rangle = \langle w, T(v)\rangle = \langle T(v), w\rangle$. Substituting this into (1), we obtain $\langle T(v), w\rangle = 0$ for any $v, w \in V$. By (i), $T = 0$.

For our example, consider the linear operator T on $R^2$ defined by $T(x, y) = (y, -x)$. Then $\langle T(u), u\rangle = 0$ for every $u \in V$, but $T \ne 0$.


ORTHOGONAL AND UNITARY OPERATORS AND MATRICES 


13.23. Prove Theorem 13.9: The following conditions on an operator U are equivalent:
(i) $U^* = U^{-1}$; (ii) $\langle U(v), U(w)\rangle = \langle v, w\rangle$, for every $v, w \in V$; (iii) $\|U(v)\| = \|v\|$, for every $v \in V$.

Suppose (i) holds. Then, for every $v, w \in V$,
$$\langle U(v), U(w)\rangle = \langle v, U^*U(w)\rangle = \langle v, I(w)\rangle = \langle v, w\rangle$$
Thus (i) implies (ii). Now if (ii) holds, then
$$\|U(v)\| = \sqrt{\langle U(v), U(v)\rangle} = \sqrt{\langle v, v\rangle} = \|v\|$$
Hence (ii) implies (iii). It remains to show that (iii) implies (i).

Suppose (iii) holds. Then for every $v \in V$,
$$\langle U^*U(v), v\rangle = \langle U(v), U(v)\rangle = \langle v, v\rangle = \langle I(v), v\rangle$$
Hence $\langle (U^*U - I)(v), v\rangle = 0$ for every $v \in V$. But $U^*U - I$ is self-adjoint (Prove!); then by Problem 13.22 we have $U^*U - I = 0$ and so $U^*U = I$. Thus $U^* = U^{-1}$ as claimed.


13.24. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show that $W^\perp$ is also invariant under U.

Since U is nonsingular, $U(W) = W$; that is, for any $w \in W$ there exists $w' \in W$ such that $U(w') = w$. Now let $v \in W^\perp$. Then for any $w \in W$,
$$\langle U(v), w\rangle = \langle U(v), U(w')\rangle = \langle v, w'\rangle = 0$$
Thus $U(v)$ belongs to $W^\perp$. Therefore $W^\perp$ is invariant under U.


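13.25. Let A be a matrix with rows $R_i$ and columns $C_i$. Show that: (i) the ij-entry of $AA^*$ is $\langle R_i, R_j\rangle$; (ii) the ij-entry of $A^*A$ is $\langle C_j, C_i\rangle$.

If $A = (a_{ij})$, then $A^* = (b_{ij})$ where $b_{ij} = \overline{a_{ji}}$. Thus $AA^* = (c_{ij})$ where
$$\begin{aligned} c_{ij} = \sum_{k=1}^n a_{ik}b_{kj} = \sum_{k=1}^n a_{ik}\overline{a_{jk}} &= a_{i1}\overline{a_{j1}} + a_{i2}\overline{a_{j2}} + \cdots + a_{in}\overline{a_{jn}} \\ &= \langle (a_{i1}, \dots, a_{in}),\ (a_{j1}, \dots, a_{jn})\rangle = \langle R_i, R_j\rangle \end{aligned}$$
as required. Also, $A^*A = (d_{ij})$ where
$$\begin{aligned} d_{ij} = \sum_{k=1}^n b_{ik}a_{kj} = \sum_{k=1}^n \overline{a_{ki}}\,a_{kj} &= a_{1j}\overline{a_{1i}} + a_{2j}\overline{a_{2i}} + \cdots + a_{nj}\overline{a_{ni}} \\ &= \langle (a_{1j}, \dots, a_{nj}),\ (a_{1i}, \dots, a_{ni})\rangle = \langle C_j, C_i\rangle \end{aligned}$$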




13.26. Prove Theorem 13.11: The following conditions for a matrix A are equivalent: (i) A is unitary (orthogonal). (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set.

Let $R_i$ and $C_i$ denote the rows and columns of A, respectively. By the preceding problem, $AA^* = (c_{ij})$ where $c_{ij} = \langle R_i, R_j\rangle$. Thus $AA^* = I$ if and only if $\langle R_i, R_j\rangle = \delta_{ij}$. That is, (i) is equivalent to (ii).

Also, by the preceding problem, $A^*A = (d_{ij})$ where $d_{ij} = \langle C_j, C_i\rangle$. Thus $A^*A = I$ if and only if $\langle C_j, C_i\rangle = \delta_{ij}$. That is, (i) is equivalent to (iii).

Remark: Since (ii) and (iii) are equivalent, A is unitary (orthogonal) if and only if the transpose of A is unitary (orthogonal).


13.27. Find an orthogonal matrix A whose first row is $u_1 = (1/3,\ 2/3,\ 2/3)$.

First find a nonzero vector $w_2 = (x, y, z)$ which is orthogonal to $u_1$, i.e. for which
$$0 = \langle u_1, w_2\rangle = x/3 + 2y/3 + 2z/3 \quad\text{or}\quad x + 2y + 2z = 0$$
One such solution is $w_2 = (0, 1, -1)$. Normalize $w_2$ to obtain the second row of A, i.e. $u_2 = (0,\ 1/\sqrt2,\ -1/\sqrt2)$.

Next find a nonzero vector $w_3 = (x, y, z)$ which is orthogonal to both $u_1$ and $u_2$, i.e. for which
$$\begin{aligned} 0 &= \langle u_1, w_3\rangle = x/3 + 2y/3 + 2z/3 \quad\text{or}\quad x + 2y + 2z = 0 \\ 0 &= \langle u_2, w_3\rangle = y/\sqrt2 - z/\sqrt2 \quad\text{or}\quad y - z = 0 \end{aligned}$$
Set $z = -1$ and find the solution $w_3 = (4, -1, -1)$. Normalize $w_3$ and obtain the third row of A, i.e. $u_3 = (4/\sqrt{18},\ -1/\sqrt{18},\ -1/\sqrt{18})$. Thus
$$A = \begin{pmatrix} 1/3 & 2/3 & 2/3 \\ 0 & 1/\sqrt2 & -1/\sqrt2 \\ 4/3\sqrt2 & -1/3\sqrt2 & -1/3\sqrt2 \end{pmatrix}$$

We emphasize that the above matrix A is not unique.
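By Theorem 13.11, the matrix of Problem 13.27 is orthogonal exactly when its rows form an orthonormal set. The following sketch checks this numerically, assuming the rows $(1/3, 2/3, 2/3)$, $(0, 1/\sqrt2, -1/\sqrt2)$ and $(4, -1, -1)/3\sqrt2$; the names `dot`, `gram` and `is_orthogonal` are illustrative.

```python
import math

def dot(a, b):
    """Usual inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

r2 = math.sqrt(2)
A = [
    [1 / 3, 2 / 3, 2 / 3],
    [0, 1 / r2, -1 / r2],
    [4 / (3 * r2), -1 / (3 * r2), -1 / (3 * r2)],
]

# gram[i][j] = <R_i, R_j>, the ij-entry of A A^t; it should be the identity.
gram = [[dot(ri, rj) for rj in A] for ri in A]
is_orthogonal = all(abs(gram[i][j] - (1 if i == j else 0)) < 1e-12
                    for i in range(3) for j in range(3))
```

Replacing the second and third rows by any other orthonormal pair orthogonal to the first row passes the same check, which reflects the remark that A is not unique.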


13.28. Prove Theorem 13.12: Let $\{e_1, \dots, e_n\}$ be an orthonormal basis of an inner product space V. Then the transition matrix from $\{e_i\}$ into another orthonormal basis is unitary (orthogonal). Conversely, if $P = (a_{ij})$ is a unitary (orthogonal) matrix, then the following is an orthonormal basis:
$$\{e_i' = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n : i = 1, \dots, n\}$$

Suppose $\{f_i\}$ is another orthonormal basis and suppose
$$f_i = b_{i1}e_1 + b_{i2}e_2 + \cdots + b_{in}e_n, \quad i = 1, \dots, n \qquad (1)$$
By Problem 13.15 and since $\{f_i\}$ is orthonormal,
$$\delta_{ij} = \langle f_i, f_j\rangle = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}} \qquad (2)$$
Let $B = (b_{ij})$ be the matrix of coefficients in (1). (Then $B^t$ is the transition matrix from $\{e_i\}$ to $\{f_i\}$.) By Problem 13.25, $BB^* = (c_{ij})$ where $c_{ij} = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}}$. By (2), $c_{ij} = \delta_{ij}$ and therefore $BB^* = I$. Accordingly B, and hence $B^t$, are unitary.

It remains to prove that $\{e_i'\}$ is orthonormal. By Problem 13.15,
$$\langle e_i', e_j'\rangle = a_{1i}\overline{a_{1j}} + a_{2i}\overline{a_{2j}} + \cdots + a_{ni}\overline{a_{nj}} = \langle C_i, C_j\rangle$$
where $C_i$ denotes the ith column of the unitary (orthogonal) matrix $P = (a_{ij})$. By Theorem 13.11, the columns of P are orthonormal; hence $\langle e_i', e_j'\rangle = \langle C_i, C_j\rangle = \delta_{ij}$. Thus $\{e_i'\}$ is an orthonormal basis.


13.29. Suppose A is orthogonal. Show that $\det(A) = 1$ or $-1$.

Since A is orthogonal, $AA^t = I$. Using $|A| = |A^t|$,
$$1 = |I| = |AA^t| = |A|\,|A^t| = |A|^2$$
Therefore $|A| = 1$ or $-1$.




13.30. Show that every 2 by 2 orthogonal matrix A for which $\det(A) = 1$ is of the form $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ for some real number $\theta$.

Suppose $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Since A is orthogonal, its rows form an orthonormal set; hence
$$a^2 + b^2 = 1, \qquad c^2 + d^2 = 1, \qquad ac + bd = 0, \qquad ad - bc = 1$$
The last equation follows from $\det(A) = 1$. We consider separately the cases $a = 0$ and $a \ne 0$.

If $a = 0$, the first equation gives $b^2 = 1$ and therefore $b = \pm1$. Then the fourth equation gives $c = -b = \mp1$, and the second equation yields $1 + d^2 = 1$ or $d = 0$. Thus
$$A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
The first alternative has the required form with $\theta = -\pi/2$, and the second alternative has the required form with $\theta = \pi/2$.

If $a \ne 0$, the third equation can be solved to give $c = -bd/a$. Substituting this into the second equation,
$$b^2d^2/a^2 + d^2 = 1 \quad\text{or}\quad b^2d^2 + a^2d^2 = a^2 \quad\text{or}\quad (b^2 + a^2)d^2 = a^2 \quad\text{or}\quad a^2 = d^2$$
and therefore $a = d$ or $a = -d$. If $a = -d$, then the third equation yields $c = b$ and so the fourth equation gives $-a^2 - c^2 = 1$, which is impossible. Thus $a = d$. But then the third equation gives $b = -c$ and so
$$A = \begin{pmatrix} a & -c \\ c & a \end{pmatrix}$$
Since $a^2 + c^2 = 1$, there is a real number $\theta$ such that $a = \cos\theta$, $c = \sin\theta$ and hence A has the required form in this case also.


SYMMETRIC OPERATORS AND CANONICAL FORMS IN EUCLIDEAN SPACES 
13.31. Let T be a symmetric operator. Show that: (i) the characteristic polynomial $\Delta(t)$ of T is a product of linear polynomials (over R); (ii) T has a nonzero eigenvector; (iii) eigenvectors of T belonging to distinct eigenvalues are orthogonal.

(i) Let A be a matrix representing T relative to an orthonormal basis of V; then $A = A^t$. Let $\Delta(t)$ be the characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real eigenvalues by Theorem 13.8. Thus
$$\Delta(t) = (t - \lambda_1)(t - \lambda_2)\cdots(t - \lambda_n)$$
where the $\lambda_i$ are all real. In other words, $\Delta(t)$ is a product of linear polynomials over R.
(ii) By (i), T has at least one (real) eigenvalue. Hence T has a nonzero eigenvector.
(iii) Suppose $T(v) = \lambda v$ and $T(w) = \mu w$ where $\lambda \ne \mu$. We show that $\lambda\langle v, w\rangle = \mu\langle v, w\rangle$:
$$\lambda\langle v, w\rangle = \langle \lambda v, w\rangle = \langle T(v), w\rangle = \langle v, T(w)\rangle = \langle v, \mu w\rangle = \mu\langle v, w\rangle$$
But $\lambda \ne \mu$; hence $\langle v, w\rangle = 0$ as claimed.


13.32. Prove Theorem 13.14: Let T be a symmetric operator on a real inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.

The proof is by induction on the dimension of V. If $\dim V = 1$, the theorem trivially holds. Now suppose $\dim V = n > 1$. By the preceding problem, there exists a nonzero eigenvector $v_1$ of T. Let W be the space spanned by $v_1$, and let $u_1$ be a unit vector in W, e.g. let $u_1 = v_1/\|v_1\|$.

Since $v_1$ is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.21, $W^\perp$ is invariant under $T^* = T$. Thus the restriction $\hat T$ of T to $W^\perp$ is a symmetric operator. By Theorem 13.2, $V = W \oplus W^\perp$. Hence $\dim W^\perp = n - 1$ since $\dim W = 1$. By induction, there exists an orthonormal basis $\{u_2, \dots, u_n\}$ of $W^\perp$ consisting of eigenvectors of $\hat T$ and hence of T. But $\langle u_1, u_i\rangle = 0$ for $i = 2, \dots, n$ because $u_i \in W^\perp$. Accordingly $\{u_1, u_2, \dots, u_n\}$ is an orthonormal set and consists of eigenvectors of T. Thus the theorem is proved.


13.33. Let $A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$. Find a (real) orthogonal matrix P for which $P^tAP$ is diagonal.

The characteristic polynomial $\Delta(t)$ of A is
$$\Delta(t) = |tI - A| = \begin{vmatrix} t - 1 & -2 \\ -2 & t - 1 \end{vmatrix} = t^2 - 2t - 3 = (t - 3)(t + 1)$$
and thus the eigenvalues of A are 3 and $-1$. Substitute $t = 3$ into the matrix $tI - A$ to obtain the corresponding homogeneous system of linear equations
$$2x - 2y = 0, \qquad -2x + 2y = 0$$
A nonzero solution is $v_1 = (1, 1)$. Normalize $v_1$ to find the unit solution $u_1 = (1/\sqrt2,\ 1/\sqrt2)$.

Next substitute $t = -1$ into the matrix $tI - A$ to obtain the corresponding homogeneous system of linear equations
$$-2x - 2y = 0, \qquad -2x - 2y = 0$$
A nonzero solution is $v_2 = (1, -1)$. Normalize $v_2$ to find the unit solution $u_2 = (1/\sqrt2,\ -1/\sqrt2)$.

Finally let P be the matrix whose columns are $u_1$ and $u_2$ respectively; then
$$P = \begin{pmatrix} 1/\sqrt2 & 1/\sqrt2 \\ 1/\sqrt2 & -1/\sqrt2 \end{pmatrix} \quad\text{and}\quad P^tAP = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}$$
As expected, the diagonal entries of $P^tAP$ are the eigenvalues of A.


13.34. Let $A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$. Find a (real) orthogonal matrix P for which $P^tAP$ is diagonal.

First find the characteristic polynomial $\Delta(t)$ of A:
$$\Delta(t) = |tI - A| = \begin{vmatrix} t - 2 & -1 & -1 \\ -1 & t - 2 & -1 \\ -1 & -1 & t - 2 \end{vmatrix} = (t - 1)^2(t - 4)$$
Thus the eigenvalues of A are 1 (with multiplicity two) and 4 (with multiplicity one). Substitute $t = 1$ into the matrix $tI - A$ to obtain the corresponding homogeneous system
$$-x - y - z = 0, \qquad -x - y - z = 0, \qquad -x - y - z = 0$$
That is, $x + y + z = 0$. The system has two independent solutions. One such solution is $v_1 = (1, -1, 0)$. We seek a second solution $v_2 = (a, b, c)$ which is also orthogonal to $v_1$; that is, such that
$$a + b + c = 0 \quad\text{and also}\quad a - b = 0$$
For example, $v_2 = (1, 1, -2)$. Next we normalize $v_1$ and $v_2$ to obtain the unit orthogonal solutions
$$u_1 = (1/\sqrt2,\ -1/\sqrt2,\ 0), \qquad u_2 = (1/\sqrt6,\ 1/\sqrt6,\ -2/\sqrt6)$$
Now substitute $t = 4$ into the matrix $tI - A$ to find the corresponding homogeneous system
$$2x - y - z = 0, \qquad -x + 2y - z = 0, \qquad -x - y + 2z = 0$$
Find a nonzero solution such as $v_3 = (1, 1, 1)$, and normalize $v_3$ to obtain the unit solution $u_3 = (1/\sqrt3,\ 1/\sqrt3,\ 1/\sqrt3)$. Finally, if P is the matrix whose columns are the $u_i$ respectively,
$$P = \begin{pmatrix} 1/\sqrt2 & 1/\sqrt6 & 1/\sqrt3 \\ -1/\sqrt2 & 1/\sqrt6 & 1/\sqrt3 \\ 0 & -2/\sqrt6 & 1/\sqrt3 \end{pmatrix} \quad\text{and}\quad P^tAP = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{pmatrix}$$
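The result of Problem 13.34 can be verified by forming $P^tAP$ numerically. The sketch below assumes A is the 3 by 3 matrix with 2 on the diagonal and 1 elsewhere, and that the columns of P are the unit eigenvectors $(1, -1, 0)/\sqrt2$, $(1, 1, -2)/\sqrt6$ and $(1, 1, 1)/\sqrt3$; the helper names `transpose` and `matmul` are illustrative.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
P = [
    [1 / r2, 1 / r6, 1 / r3],
    [-1 / r2, 1 / r6, 1 / r3],
    [0, -2 / r6, 1 / r3],
]

# P^t A P should be diag(1, 1, 4) up to floating-point rounding.
D = matmul(matmul(transpose(P), A), P)

# P^t P should be the identity, confirming that P is orthogonal.
PtP = matmul(transpose(P), P)
```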


13.35. Find an orthogonal change of coordinates which diagonalizes the real quadratic form $q(x, y) = 2x^2 + 2xy + 2y^2$.

First find the symmetric matrix A representing q and then its characteristic polynomial $\Delta(t)$:
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \quad\text{and}\quad \Delta(t) = |tI - A| = \begin{vmatrix} t - 2 & -1 \\ -1 & t - 2 \end{vmatrix} = (t - 1)(t - 3)$$
The eigenvalues of A are 1 and 3; hence the diagonal form of q is
$$q(x', y') = x'^2 + 3y'^2$$
We find the corresponding transformation of coordinates by obtaining a corresponding orthonormal set of eigenvectors of A.

Set $t = 1$ into the matrix $tI - A$ to obtain the corresponding homogeneous system
$$-x - y = 0, \qquad -x - y = 0$$
A nonzero solution is $v_1 = (1, -1)$. Now set $t = 3$ into the matrix $tI - A$ to find the corresponding homogeneous system
$$x - y = 0, \qquad -x + y = 0$$
A nonzero solution is $v_2 = (1, 1)$. As expected by Problem 13.31, $v_1$ and $v_2$ are orthogonal. Normalize $v_1$ and $v_2$ to obtain the orthonormal basis
$$\{u_1 = (1/\sqrt2,\ -1/\sqrt2),\quad u_2 = (1/\sqrt2,\ 1/\sqrt2)\}$$
The transition matrix P and the required transformation of coordinates follow:
$$P = \begin{pmatrix} 1/\sqrt2 & 1/\sqrt2 \\ -1/\sqrt2 & 1/\sqrt2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} x \\ y \end{pmatrix} = P\begin{pmatrix} x' \\ y' \end{pmatrix} \quad\text{or}\quad \begin{aligned} x &= (x' + y')/\sqrt2 \\ y &= (-x' + y')/\sqrt2 \end{aligned}$$
Note that the columns of P are $u_1$ and $u_2$. We can also express $x'$ and $y'$ in terms of x and y by using $P^{-1} = P^t$; that is,
$$x' = (x - y)/\sqrt2, \qquad y' = (x + y)/\sqrt2$$

13.36. Prove Theorem 13.15: Let T be an orthogonal operator on a real inner product space V. Then there is an orthonormal basis with respect to which T has the following form:
$$\begin{pmatrix}
1 & & & & & & & & \\
 & \ddots & & & & & & & \\
 & & 1 & & & & & & \\
 & & & -1 & & & & & \\
 & & & & \ddots & & & & \\
 & & & & & -1 & & & \\
 & & & & & & \begin{matrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{matrix} & & \\
 & & & & & & & \ddots & \\
 & & & & & & & & \begin{matrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{matrix}
\end{pmatrix}$$

Let $S = T + T^{-1} = T + T^*$. Then $S^* = (T + T^*)^* = T^* + T = S$. Thus S is a symmetric operator on V. By Theorem 13.14, there exists an orthonormal basis of V consisting of eigenvectors of S. If $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of S, then V can be decomposed into the direct sum $V = V_1 \oplus V_2 \oplus \cdots \oplus V_m$, where the $V_i$ consists of the eigenvectors of S belonging to $\lambda_i$. We claim that each $V_i$ is invariant under T. For suppose $v \in V_i$; then $S(v) = \lambda_iv$ and
$$S(T(v)) = (T + T^{-1})T(v) = T(T + T^{-1})(v) = TS(v) = T(\lambda_iv) = \lambda_iT(v)$$
That is, $T(v) \in V_i$. Hence $V_i$ is invariant under T. Since the $V_i$ are orthogonal to each other, we can restrict our investigation to the way that T acts on each individual $V_i$.

On a given $V_i$, $(T + T^{-1})v = S(v) = \lambda_iv$. Multiplying by T,
$$(T^2 - \lambda_iT + I)(v) = 0$$
We consider the cases $\lambda_i = \pm2$ and $\lambda_i \ne \pm2$ separately. If $\lambda_i = \pm2$, then $(T \mp I)^2(v) = 0$, which leads to $(T \mp I)(v) = 0$ or $T(v) = \pm v$. Thus T restricted to this $V_i$ is either I or $-I$.

If $\lambda_i \ne \pm2$, then T has no eigenvectors in $V_i$ since by Theorem 13.8 the only eigenvalues of T are 1 or $-1$. Accordingly, for $v \ne 0$ the vectors v and $T(v)$ are linearly independent. Let W be the subspace spanned by v and $T(v)$. Then W is invariant under T, since
$$T(T(v)) = T^2(v) = \lambda_iT(v) - v$$
By Theorem 13.2, $V_i = W \oplus W^\perp$. Furthermore, by Problem 13.24 $W^\perp$ is also invariant under T. Thus we can decompose $V_i$ into the direct sum of two dimensional subspaces $W_j$ where the $W_j$ are orthogonal to each other and each $W_j$ is invariant under T. Thus we can now restrict our investigation to the way T acts on each individual $W_j$.

Since $T^2 - \lambda_iT + I = 0$, the characteristic polynomial $\Delta(t)$ of T acting on $W_j$ is $\Delta(t) = t^2 - \lambda_it + 1$. Thus the determinant of T is 1, the constant term in $\Delta(t)$. By Problem 13.30, the matrix A representing T acting on $W_j$ relative to any orthonormal basis of $W_j$ must be of the form
$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
The union of the bases of the $W_j$ gives an orthonormal basis of $V_i$, and the union of the bases of the $V_i$ gives an orthonormal basis of V in which the matrix representing T is of the desired form.


NORMAL OPERATORS AND CANONICAL FORMS IN UNITARY SPACES 


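13.37. Determine which matrix is normal: (i) $A = \begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix}$, (ii) $B = \begin{pmatrix} 1 & i \\ 1 & 2 + i \end{pmatrix}$.

(i) $AA^* = \begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -i & 1 \end{pmatrix} = \begin{pmatrix} 2 & i \\ -i & 1 \end{pmatrix}$, $\quad A^*A = \begin{pmatrix} 1 & 0 \\ -i & 1 \end{pmatrix}\begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & i \\ -i & 2 \end{pmatrix}$

Since $AA^* \ne A^*A$, the matrix A is not normal.

(ii) $BB^* = \begin{pmatrix} 1 & i \\ 1 & 2 + i \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -i & 2 - i \end{pmatrix} = \begin{pmatrix} 2 & 2 + 2i \\ 2 - 2i & 6 \end{pmatrix}$, $\quad B^*B = \begin{pmatrix} 1 & 1 \\ -i & 2 - i \end{pmatrix}\begin{pmatrix} 1 & i \\ 1 & 2 + i \end{pmatrix} = \begin{pmatrix} 2 & 2 + 2i \\ 2 - 2i & 6 \end{pmatrix}$

Since $BB^* = B^*B$, the matrix B is normal.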
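The normality test $MM^* = M^*M$ is easy to automate. The sketch below is illustrative only: it assumes the two matrices of this problem read as A = (1, i; 0, 1) and B = (1, i; 1, 2+i), and the helper names `conj_transpose`, `matmul` and `is_normal` are not from the text.

```python
def conj_transpose(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[entry.conjugate() for entry in col] for col in zip(*m)]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def is_normal(m, tol=1e-12):
    """True if m commutes with its conjugate transpose, entry by entry within tol."""
    ms = conj_transpose(m)
    left, right = matmul(m, ms), matmul(ms, m)
    return all(abs(x - y) <= tol
               for row_l, row_r in zip(left, right)
               for x, y in zip(row_l, row_r))

# Assumed readings of the two matrices in the problem.
A = [[1, 1j], [0, 1]]
B = [[1, 1j], [1, 2 + 1j]]

BB_star = matmul(B, conj_transpose(B))
```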




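13.38. Let T be a normal operator. Prove:
(i) $T(v) = 0$ if and only if $T^*(v) = 0$.
(ii) $T - \lambda I$ is normal.
(iii) If $T(v) = \lambda v$, then $T^*(v) = \bar\lambda v$; hence any eigenvector of T is also an eigenvector of $T^*$.
(iv) If $T(v) = \lambda_1v$ and $T(w) = \lambda_2w$ where $\lambda_1 \ne \lambda_2$, then $\langle v, w\rangle = 0$; that is, eigenvectors of T belonging to distinct eigenvalues are orthogonal.

(i) We show that $\langle T(v), T(v)\rangle = \langle T^*(v), T^*(v)\rangle$:
$$\langle T(v), T(v)\rangle = \langle v, T^*T(v)\rangle = \langle v, TT^*(v)\rangle = \langle T^*(v), T^*(v)\rangle$$
Hence by $[I_3]$, $T(v) = 0$ if and only if $T^*(v) = 0$.

(ii) We show that $T - \lambda I$ commutes with its adjoint:
$$\begin{aligned} (T - \lambda I)(T - \lambda I)^* &= (T - \lambda I)(T^* - \bar\lambda I) = TT^* - \lambda T^* - \bar\lambda T + \lambda\bar\lambda I \\ &= T^*T - \bar\lambda T - \lambda T^* + \bar\lambda\lambda I = (T^* - \bar\lambda I)(T - \lambda I) \\ &= (T - \lambda I)^*(T - \lambda I) \end{aligned}$$
Thus $T - \lambda I$ is normal.

(iii) If $T(v) = \lambda v$, then $(T - \lambda I)(v) = 0$. Now $T - \lambda I$ is normal by (ii); therefore, by (i), $(T - \lambda I)^*(v) = 0$. That is, $(T^* - \bar\lambda I)(v) = 0$; hence $T^*(v) = \bar\lambda v$.

(iv) We show that $\lambda_1\langle v, w\rangle = \lambda_2\langle v, w\rangle$:
$$\lambda_1\langle v, w\rangle = \langle \lambda_1v, w\rangle = \langle T(v), w\rangle = \langle v, T^*(w)\rangle = \langle v, \bar\lambda_2w\rangle = \lambda_2\langle v, w\rangle$$
But $\lambda_1 \ne \lambda_2$; hence $\langle v, w\rangle = 0$.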


Prove Theorem 13.16: Let T be a normal operator on a complex finite dimensional 
inner product space V. Then there exists an orthonormal basis of V consisting of 
eigenvectors of 7; that is, 7’ can be represented by a diagonal matrix relative to an 
orthonormal basis. 


The proof is by induction on the dimension of V. If dimV=1, then the theorem trivially 
holds. Now suppose dimV =n > 1. Since V is a complex vector space, T has at least. one eigen- 
value and hence a nonzero eigenvector v. Let W be the subspace of V spanned by v and let w, be a 
unit vector in W. 


Since v is an eigenvector of 7, the subspace W is invariant under T. However, v is also an 
eigenvector of T* by the preceding problem; hence W is also invariant under T*. By Problem 13.21, 


w+ is invariant under T** = 7. The remainder of the proof is identical with the latter part of 
the proof of Theorem 13.14 (Problem 13.32). 


Prove Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional
inner product space V. Then T can be represented by a triangular matrix relative
to an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$; that is, for i = 1, ..., n,


$$T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{ii}u_i$$


The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially
holds. Now suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue
and hence at least one nonzero eigenvector v. Let W be the subspace of V spanned by v and
let $u_1$ be a unit vector in W. Then $u_1$ is an eigenvector of T and, say, $T(u_1) = a_{11}u_1$.


By Theorem 13.2, $V = W \oplus W^\perp$. Let E denote the orthogonal projection of V onto $W^\perp$.
Clearly $W^\perp$ is invariant under the operator ET. By induction, there exists an orthonormal basis
$\{u_2, \ldots, u_n\}$ of $W^\perp$ such that, for i = 2, ..., n,

$$ET(u_i) = a_{i2}u_2 + a_{i3}u_3 + \cdots + a_{ii}u_i$$

(Note that $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal basis of V.) But E is the orthogonal projection of V
onto $W^\perp$; hence we must have

$$T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{ii}u_i$$


for i = 2, ..., n. This with $T(u_1) = a_{11}u_1$ gives us the desired result.


MISCELLANEOUS PROBLEMS


13.41. Prove Theorem 13.13A: The following conditions on an operator P are equivalent:
(i) $P = T^2$ for some self-adjoint operator T.
(ii) $P = S^*S$ for some operator S.
(iii) P is self-adjoint and $(P(u), u) \geq 0$ for every $u \in V$.

Suppose (i) holds, that is, $P = T^2$ where T = T*. Then P = TT = T*T and so (i) implies
(ii). Now suppose (ii) holds. Then P* = (S*S)* = S*S** = S*S = P and so P is self-adjoint.
Furthermore,

$$(P(u), u) = (S^*S(u), u) = (S(u), S(u)) \geq 0$$

Thus (ii) implies (iii), and so it remains to prove that (iii) implies (i).


Now suppose (iii) holds. Since P is self-adjoint, there exists an orthonormal basis $\{u_1, \ldots, u_n\}$
of V consisting of eigenvectors of P; say, $P(u_i) = \lambda_i u_i$. By Theorem 13.8, the $\lambda_i$ are real. Using
(iii), we show that the $\lambda_i$ are nonnegative. We have, for each i,

$$0 \leq (P(u_i), u_i) = (\lambda_i u_i, u_i) = \lambda_i(u_i, u_i)$$

Since $(u_i, u_i) = 1 > 0$, this forces $\lambda_i \geq 0$, as claimed. Accordingly, $\sqrt{\lambda_i}$ is a real number. Let T be the
linear operator defined by

$$T(u_i) = \sqrt{\lambda_i}\,u_i, \qquad i = 1, \ldots, n$$

Since T is represented by a real diagonal matrix relative to the orthonormal basis $\{u_i\}$, T is self-adjoint.
Moreover, for each i,

$$T^2(u_i) = T(\sqrt{\lambda_i}\,u_i) = \sqrt{\lambda_i}\,T(u_i) = \sqrt{\lambda_i}\sqrt{\lambda_i}\,u_i = \lambda_i u_i = P(u_i)$$

Since $T^2$ and P agree on a basis of V, $P = T^2$. Thus the theorem is proved.


Remark: The above operator T is the unique positive operator such that $P = T^2$ (Problem
13.93); it is called the positive square root of P.


13.42. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint
operator.

Set $S = \tfrac12(T + T^*)$ and $U = \tfrac12(T - T^*)$. Then T = S + U where

$$S^* = (\tfrac12(T + T^*))^* = \tfrac12(T^* + T^{**}) = \tfrac12(T^* + T) = S$$

and

$$U^* = (\tfrac12(T - T^*))^* = \tfrac12(T^* - T) = -\tfrac12(T - T^*) = -U$$

i.e. S is self-adjoint and U is skew-adjoint.


13.43.
Prove: Let T be an arbitrary linear operator on a finite dimensional inner product
space V. Then T is a product of a unitary (orthogonal) operator U and a unique
positive operator P, that is, T = UP. Furthermore, if T is invertible, then U is also
uniquely determined.


By Theorem 13.13, T*T is a positive operator and hence there exists a (unique) positive operator
P such that $P^2 = T^*T$ (Problem 13.93). Observe that

$$\|P(v)\|^2 = (P(v), P(v)) = (P^2(v), v) = (T^*T(v), v) = (T(v), T(v)) = \|T(v)\|^2 \qquad (1)$$

We now consider separately the cases when T is invertible and non-invertible.

If T is invertible, then we set $\hat U = PT^{-1}$. We show that $\hat U$ is unitary:

$$\hat U^* = (PT^{-1})^* = (T^{-1})^*P^* = (T^*)^{-1}P \quad\text{and}\quad \hat U^*\hat U = (T^*)^{-1}PPT^{-1} = (T^*)^{-1}T^*TT^{-1} = I$$

Thus $\hat U$ is unitary. We next set $U = \hat U^{-1}$. Then U is also unitary and T = UP as required.

To prove uniqueness, we assume $T = U_0P_0$ where $U_0$ is unitary and $P_0$ is positive. Then

$$T^*T = P_0^*U_0^*U_0P_0 = P_0IP_0 = P_0^2$$

But the positive square root of T*T is unique (Problem 13.93); hence $P_0 = P$. (Note that the
invertibility of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also
by (1). Multiplying $U_0P = UP$ on the right by $P^{-1}$ yields $U_0 = U$. Thus U is also unique when
T is invertible.
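
For the invertible case, the construction $P = \sqrt{T^*T}$, $U = TP^{-1}$ can be tried on a small real example. The Python sketch below is only an illustration under stated assumptions: the matrix T is an arbitrary choice, and the square root uses a closed form for 2 × 2 symmetric positive definite matrices, $\sqrt{M} = (M + sI)/t$ with $s = \sqrt{\det M}$ and $t = \sqrt{\operatorname{tr} M + 2s}$, which is a convenient shortcut and not the eigenvector construction used in the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def sqrt_pd_2x2(m):
    """Positive square root of a 2x2 symmetric positive definite matrix,
    via sqrt(M) = (M + s I) / t, s = sqrt(det M), t = sqrt(trace M + 2 s)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    s = math.sqrt(det)
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

T = [[3.0, 0.0], [4.0, 5.0]]                 # sample invertible operator (assumption)
P = sqrt_pd_2x2(matmul(transpose(T), T))     # positive operator with P^2 = T^t T
U = matmul(T, inverse_2x2(P))                # U = T P^{-1}, hence T = U P
```

For this T the factor U is orthogonal up to rounding and the product UP reproduces T.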


Now suppose T is not invertible. Let W be the image of P, i.e. W = Im P. We define
$U_1: W \to V$ by

$$U_1(w) = T(v) \quad\text{where } P(v) = w \qquad (2)$$

We must show that $U_1$ is well defined, that is, that P(v) = P(v′) implies T(v) = T(v′). This follows
from the fact that P(v − v′) = 0 is equivalent to $\|P(v - v')\| = 0$ which forces $\|T(v - v')\| = 0$
by (1). Thus $U_1$ is well defined. We next define $U_2: W^\perp \to V$. Note by (1) that P and T have the
same kernels. Hence the images of P and T have the same dimension, i.e. dim (Im P) = dim W =
dim (Im T). Consequently, $W^\perp$ and $(\operatorname{Im} T)^\perp$ also have the same dimension. We let $U_2$ be any
isomorphism between $W^\perp$ and $(\operatorname{Im} T)^\perp$ that preserves inner products.

We next set $U = U_1 \oplus U_2$. (Here U is defined as follows: if $v \in V$ and v = w + w′ where
$w \in W$, $w' \in W^\perp$, then $U(v) = U_1(w) + U_2(w')$.) Now U is linear (Problem 13.121) and, if $v \in V$
and P(v) = w, then by (2)

$$T(v) = U_1(w) = U(w) = UP(v)$$

Thus T = UP as required.

It remains to show that U is unitary. Now every vector $x \in V$ can be written in the form
x = P(v) + w′ where $w' \in W^\perp$. Then $U(x) = UP(v) + U_2(w') = T(v) + U_2(w')$ where
$(T(v), U_2(w')) = 0$ by definition of $U_2$. Also, (T(v), T(v)) = (P(v), P(v)) by (1). Thus

$$(U(x), U(x)) = (T(v) + U_2(w'),\; T(v) + U_2(w'))$$
$$= (T(v), T(v)) + (U_2(w'), U_2(w'))$$
$$= (P(v), P(v)) + (w', w') = (P(v) + w',\; P(v) + w')$$
$$= (x, x)$$

(We also used the fact that (P(v), w′) = 0.) Thus U is unitary and the theorem is proved.


13.44. Let $(a_1, a_2, \ldots)$ and $(b_1, b_2, \ldots)$ be any pair of points in $l_2$-space of Example 13.5.
Show that the sum $\sum_{i=1}^{\infty} a_ib_i = a_1b_1 + a_2b_2 + \cdots$ converges absolutely.


By Problem 1.16 (Cauchy-Schwarz inequality),

$$|a_1b_1| + \cdots + |a_nb_n| \;\leq\; \sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2} \;\leq\; \sqrt{\sum_{i=1}^{\infty} a_i^2}\,\sqrt{\sum_{i=1}^{\infty} b_i^2}$$

which holds for every n. Thus the (monotonic) sequence of sums $S_n = |a_1b_1| + \cdots + |a_nb_n|$ is
bounded, and therefore converges. Hence the infinite sum converges absolutely.


13.45.
Let V be the vector space of polynomials over R with inner product defined by
$(f, g) = \int_0^1 f(t)\,g(t)\,dt$. Give an example of a linear functional $\phi$ on V for which
Theorem 13.5 does not hold, i.e. there does not exist a polynomial h(t) for which
$\phi(f) = (f, h)$ for every $f \in V$.


Let $\phi: V \to R$ be defined by $\phi(f) = f(0)$, that is, $\phi$ evaluates f(t) at 0 and hence maps f(t) into
its constant term. Suppose a polynomial h(t) exists for which

$$\phi(f) = f(0) = \int_0^1 f(t)\,h(t)\,dt \qquad (1)$$

for every polynomial f(t). Observe that $\phi$ maps the polynomial tf(t) into 0; hence by (1),

$$\int_0^1 t\,f(t)\,h(t)\,dt = 0 \qquad (2)$$

for every polynomial f(t). In particular, (2) must hold for f(t) = th(t), that is,

$$\int_0^1 t^2h^2(t)\,dt = 0$$

This integral forces h(t) to be the zero polynomial; hence $\phi(f) = (f, h) = (f, 0) = 0$ for every
polynomial f(t). This contradicts the fact that $\phi$ is not the zero functional; hence the polynomial h(t)
does not exist.


Supplementary Problems


INNER PRODUCTS


13.46. Verify that

$$(a_1u_1 + a_2u_2,\; b_1v_1 + b_2v_2) = a_1\bar b_1(u_1, v_1) + a_1\bar b_2(u_1, v_2) + a_2\bar b_1(u_2, v_1) + a_2\bar b_2(u_2, v_2)$$

More generally, prove that

$$\Big(\sum_{i=1}^{m} a_iu_i,\; \sum_{j=1}^{n} b_jv_j\Big) = \sum_{i,j} a_i\bar b_j(u_i, v_j)$$


13.47. Let $u = (x_1, x_2)$ and $v = (y_1, y_2)$ belong to $R^2$.

(i) Verify that the following is an inner product on $R^2$:
$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2$

(ii) For what values of k is the following an inner product on $R^2$?
$f(u, v) = x_1y_1 - 3x_1y_2 - 3x_2y_1 + kx_2y_2$

(iii) For what values of $a, b, c, d \in R$ is the following an inner product on $R^2$?
$f(u, v) = ax_1y_1 + bx_1y_2 + cx_2y_1 + dx_2y_2$


13.48. Find the norm of $v = (1, 2) \in R^2$ with respect to (i) the usual inner product, (ii) the inner
product in Problem 13.47(i).


13.49. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $C^2$.

(i) Verify that the following is an inner product on $C^2$:
$f(u, v) = z_1\bar w_1 + (1 + i)z_1\bar w_2 + (1 - i)z_2\bar w_1 + 3z_2\bar w_2$

(ii) For what values of $a, b, c, d \in C$ is the following an inner product on $C^2$?
$f(u, v) = az_1\bar w_1 + bz_1\bar w_2 + cz_2\bar w_1 + dz_2\bar w_2$


13.50. Find the norm of $v = (1 - 2i,\, 2 + 3i) \in C^2$ with respect to (i) the usual inner product, (ii) the
inner product in Problem 13.49(i).


13.51. Show that the distance function $d(u, v) = \|v - u\|$, where $u, v \in V$, satisfies the following axioms
of a metric space:

$[D_1]$ $d(u, v) \geq 0$; and d(u, v) = 0 if and only if u = v.

$[D_2]$ d(u, v) = d(v, u).

$[D_3]$ $d(u, v) \leq d(u, w) + d(w, v)$.


13.52. Verify the Parallelogram Law: $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$.


13.53. Verify the following polar forms for (u, v):
(i) $(u, v) = \tfrac14\|u + v\|^2 - \tfrac14\|u - v\|^2$ (real case);
(ii) $(u, v) = \tfrac14\|u + v\|^2 - \tfrac14\|u - v\|^2 + \tfrac{i}{4}\|u + iv\|^2 - \tfrac{i}{4}\|u - iv\|^2$ (complex case).


13.54. Let V be the vector space of m × n matrices over R. Show that $(A, B) = \operatorname{tr}(B^tA)$ defines an inner
product in V.


13.55. Let V be the vector space of polynomials over R. Show that $(f, g) = \int_0^1 f(t)\,g(t)\,dt$ defines an
inner product in V.


13.56. Find the norm of each of the following vectors:

(i) $u = (\tfrac12, -\tfrac14, \tfrac13, \tfrac16) \in R^4$,

(ii) $v = (1 - 2i,\, 3 + i,\, 2 - 5i) \in C^3$,

(iii) $f(t) = t^2 - 2t + 3$ in the space of Problem 13.55,


(iv) $A = \begin{pmatrix} 1 & 2 \\ 3 & -4 \end{pmatrix}$ in the space of Problem 13.54.
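
The norm computations in Problems 13.48 and 13.50 for the non-standard inner products can be checked with a few lines of Python. This is a sketch only: it assumes the forms $f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2$ on $R^2$ and $f(u, v) = z_1\bar w_1 + (1+i)z_1\bar w_2 + (1-i)z_2\bar w_1 + 3z_2\bar w_2$ on $C^2$, and the function names are illustrative.

```python
def real_form(u, v):
    """The inner product of Problem 13.47(i) on R^2 (assumed coefficients)."""
    x1, x2 = u
    y1, y2 = v
    return x1 * y1 - 2 * x1 * y2 - 2 * x2 * y1 + 5 * x2 * y2

def complex_form(u, v):
    """The inner product of Problem 13.49(i) on C^2 (assumed coefficients);
    it is conjugate-linear in the second argument."""
    z1, z2 = u
    w1, w2 = (complex(w).conjugate() for w in v)
    return z1 * w1 + (1 + 1j) * z1 * w2 + (1 - 1j) * z2 * w1 + 3 * z2 * w2

v_real = (1, 2)
v_cplx = (1 - 2j, 2 + 3j)

norm_sq_real = real_form(v_real, v_real)       # squared norm of (1, 2)
norm_sq_cplx = complex_form(v_cplx, v_cplx)    # squared norm of (1 - 2i, 2 + 3i)
```

The squared norms come out as 13 and 50, so the norms are $\sqrt{13}$ and $5\sqrt2$.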


13.57. Show that: (i) the sum of two inner products is an inner product; (ii) a positive multiple of an
inner product is an inner product.


13.58. Let $a, b, c \in R$ be such that $at^2 + bt + c \geq 0$ for every $t \in R$. Show that $b^2 - 4ac \leq 0$. Use this
result to prove the Cauchy-Schwarz inequality for real inner product spaces by expanding
$\|tu + v\|^2 \geq 0$.


13.59. Suppose $|(u, v)| = \|u\|\,\|v\|$. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show
that u and v are linearly dependent.


13.60. Find the cosine of the angle $\theta$ between u and v if:
(i) $u = (1, -3, 2)$, $v = (2, 1, 5)$ in $R^3$;
(ii) $u = 2t - 1$, $v = t^2$ in the space of Problem 13.55;
(iii) $u = \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}$, $v = \begin{pmatrix} 0 & -1 \\ 2 & 3 \end{pmatrix}$ in the space of Problem 13.54.


ORTHOGONALITY
13.61. Find a basis of the subspace W of $R^4$ orthogonal to $u_1 = (1, -2, 3, 4)$ and $u_2 = (3, -5, 7, 8)$.


13.62. Find an orthonormal basis for the subspace W of $C^3$ spanned by $u_1 = (1, i, 1)$ and $u_2 = (1 + i, 0, 2)$.


13.63. Let V be the vector space of polynomials over R of degree $\leq 2$ with inner product
$(f, g) = \int_0^1 f(t)\,g(t)\,dt$.


(i) Find a basis of the subspace W orthogonal to h(t) = 2t + 1.


(ii) Apply the Gram-Schmidt orthogonalization process to the basis $\{1, t, t^2\}$ to obtain an orthonormal
basis $\{u_1(t), u_2(t), u_3(t)\}$ of V.
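
The Gram-Schmidt step in part (ii) can be carried out exactly with rational arithmetic. The following Python sketch represents a polynomial by its list of coefficients (constant term first) and uses $(f, g) = \int_0^1 f(t)\,g(t)\,dt$; the representation and the function names are assumptions made for illustration. It produces an orthogonal basis and the squared norms, and leaves the final normalization (division by the square roots of those norms) to the reader.

```python
from fractions import Fraction

def inner(f, g):
    """(f, g) = integral from 0 to 1 of f(t) g(t) dt for coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            # integral of t^(i+j) over [0, 1] is 1 / (i + j + 1)
            total += Fraction(a) * Fraction(b) / (i + j + 1)
    return total

def gram_schmidt(basis):
    """Return an orthogonal (not normalized) basis spanning the same space."""
    ortho = []
    for f in basis:
        w = [Fraction(c) for c in f]
        for u in ortho:
            coeff = inner(f, u) / inner(u, u)
            # subtract the component of f along u
            w = [wc - coeff * (u[k] if k < len(u) else 0)
                 for k, wc in enumerate(w)]
        ortho.append(w)
    return ortho

basis = [[1], [0, 1], [0, 0, 1]]          # 1, t, t^2
ortho = gram_schmidt(basis)               # 1, t - 1/2, t^2 - t + 1/6
norms_sq = [inner(w, w) for w in ortho]   # 1, 1/12, 1/180
```

Dividing each orthogonal polynomial by its norm gives $1$, $\sqrt3(2t - 1)$ and $\sqrt5(6t^2 - 6t + 1)$.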


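13.64. Let V be the vector space of 2 × 2 matrices over R with inner product defined by $(A, B) = \operatorname{tr}(B^tA)$.
(i) Show that the following is an orthonormal basis of V:

$$\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\; \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\; \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},\; \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}$$

(ii) Find a basis for the orthogonal complement of (a) the diagonal matrices, (b) the symmetric
matrices.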


13.65. Let W be a subset (not necessarily subspace) of V. Prove: (i) $W^\perp = L(W)^\perp$; (ii) if V has finite
dimension, then $W^{\perp\perp} = L(W)$. (Here L(W) is the space spanned by W.)


13.66. Let W be the subspace spanned by a nonzero vector w in V, and let E be the orthogonal projection
of V onto W. Prove $E(v) = \dfrac{(v, w)}{\|w\|^2}\,w$. We call E(v) the projection of v along w.


13.67. Find the projection of v along w if:
(i) $v = (1, -1, 2)$, $w = (0, 1, 1)$ in $R^3$;
(ii) $v = (1 - i,\, 2 + 3i)$, $w = (2 - i,\, 3)$ in $C^2$;
(iii) $v = 2t - 1$, $w = t^2$ in the space of Problem 13.55;


RS yey ome ay Arian Were 8 
iv = a Ls) Wine é 9) in the space of Problem 13.54. 


13.68. Suppose $\{u_1, \ldots, u_r\}$ is a basis of a subspace W of V where dim V = n. Let $\{v_1, \ldots, v_{n-r}\}$ be an
independent set of n − r vectors such that $(u_i, v_j) = 0$ for each i and each j. Show that
$\{v_1, \ldots, v_{n-r}\}$ is a basis of the orthogonal complement $W^\perp$.


13.69. Suppose $\{u_1, \ldots, u_r\}$ is an orthonormal basis for a subspace W of V. Let $E: V \to V$ be the linear
mapping defined by

$$E(v) = (v, u_1)u_1 + (v, u_2)u_2 + \cdots + (v, u_r)u_r$$

Show that E is the orthogonal projection of V onto W.
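
The formula $E(v) = \sum_i (v, u_i)u_i$ is easy to try numerically. The Python sketch below works in $R^3$ with the usual inner product and exact fractions; the orthonormal pair $u_1, u_2$ and the vector v are arbitrary choices made for illustration, and the function names are not from the text.

```python
from fractions import Fraction

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def project(v, orthonormal):
    """E(v) = sum of (v, u_i) u_i over an orthonormal set {u_i}."""
    result = [Fraction(0)] * len(v)
    for u in orthonormal:
        c = dot(v, u)
        result = [r + c * ui for r, ui in zip(result, u)]
    return result

u1 = [Fraction(1), Fraction(0), Fraction(0)]
u2 = [Fraction(0), Fraction(3, 5), Fraction(4, 5)]
v = [Fraction(2), Fraction(1), Fraction(3)]

Ev = project(v, [u1, u2])                    # component of v in W = span{u1, u2}
residual = [a - b for a, b in zip(v, Ev)]    # v - E(v), which should lie in W-perp
bessel_sum = dot(v, u1) ** 2 + dot(v, u2) ** 2
```

Here the residual is orthogonal to both $u_1$ and $u_2$, applying `project` to `Ev` returns `Ev` again, and `bessel_sum` does not exceed $\|v\|^2$, in line with Bessel's inequality.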


13.70. Let $\{u_1, \ldots, u_r\}$ be an orthonormal subset of V. Show that, for any $v \in V$, $\sum_{i=1}^{r} |(v, u_i)|^2 \leq \|v\|^2$.
(This is known as Bessel's inequality.)


13.71. Let V be a real inner product space. Show that:
(i) $\|u\| = \|v\|$ if and only if (u + v, u − v) = 0;
(ii) $\|u + v\|^2 = \|u\|^2 + \|v\|^2$ if and only if (u, v) = 0.


Show by counterexamples that the above statements are not true for, say, $C^2$.


13.72. Let U and W be subspaces of a finite dimensional inner product space V. Show that:
(i) $(U + W)^\perp = U^\perp \cap W^\perp$; (ii) $(U \cap W)^\perp = U^\perp + W^\perp$.


ADJOINT OPERATOR
13.73. Let $T: R^3 \to R^3$ be defined by T(x, y, z) = (x + 2y, 3x − 4z, y). Find T*(x, y, z).


13.74. Let $T: C^3 \to C^3$ be defined by

$$T(x, y, z) = (ix + (2 + 3i)y,\; 3x + (3 - i)z,\; (2 - 5i)y + iz)$$

Find T*(x, y, z).


13.75. For each of the following linear functionals $\phi$ on V find a vector $u \in V$ such that $\phi(v) = (v, u)$ for
every $v \in V$:


(i) $\phi: R^3 \to R$ defined by $\phi(x, y, z) = x + 2y - 3z$.
(ii) $\phi: C^3 \to C$ defined by $\phi(x, y, z) = ix + (2 + 3i)y + (1 - 2i)z$.
(iii) $\phi: V \to R$ defined by $\phi(f) = f(1)$ where V is the vector space of Problem 13.63.


13.76. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the
kernel of T, i.e. $\operatorname{Im} T^* = (\operatorname{Ker} T)^\perp$. Hence rank(T) = rank(T*).


13.77. Show that T*T =0 implies T =0. 


13.78. Let V be the vector space of polynomials over R with inner product defined by $(f, g) = \int_0^1 f(t)\,g(t)\,dt$.
Let D be the derivative operator on V, i.e. D(f) = df/dt. Show that there is no operator D* on V
such that (D(f), g) = (f, D*(g)) for every $f, g \in V$. That is, D has no adjoint.


UNITARY AND ORTHOGONAL OPERATORS AND MATRICES 
13.79. Find an orthogonal matrix whose first row is: (i) $(1/\sqrt5,\, 2/\sqrt5)$; (ii) a multiple of (1, 1, 1).


13.80. Find a symmetric orthogonal matrix whose first row is (1/3, 2/3, 2/3). (Compare with Problem 


BS O476) 
13.81. Find a unitary matrix whose first row is: (i) a multiple of (1, 1 − i); (ii) $(\tfrac12,\, \tfrac12 i,\, \tfrac12 - \tfrac12 i)$.


13.82. Prove: The product and inverses of orthogonal matrices are orthogonal. (Thus the orthogonal 
matrices form a group under multiplication called the orthogonal group.) 


13.83. Prove: The product and inverses of unitary matrices are unitary. (Thus the unitary matrices 
form a group under multiplication called the unitary group.) 


13.84. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal. 




13.85. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix 
P such that B = P*AP. Show that this relation is an equivalence relation. 


13.86. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal 
matrix P such that B = PtAP. Show that this relation is an equivalence relation. 


13.87. Let W be a subspace of V. For any $v \in V$ let v = w + w′ where $w \in W$, $w' \in W^\perp$. (Such a sum
is unique because $V = W \oplus W^\perp$.) Let $T: V \to V$ be defined by T(v) = w − w′. Show that T is
a self-adjoint unitary operator on V.


13.88. Let V be an inner product space, and suppose $U: V \to V$ (not necessarily linear) is surjective (onto)
and preserves inner products, i.e. (U(v), U(w)) = (v, w) for every $v, w \in V$. Prove that U is
linear and hence unitary.


POSITIVE AND POSITIVE DEFINITE OPERATORS 
13.89. Show that the sum of two positive (positive definite) operators is positive (positive definite). 


13.90. Let T be a linear operator on V and let $f: V \times V \to K$ be defined by f(u, v) = (T(u), v). Show
that f is itself an inner product on V if and only if T is positive definite.


13.91. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI + E is positive
(positive definite) if $k \geq 0$ (k > 0).


13.92. Prove Theorem 13.13B, page 288, on positive definite operators. (The corresponding Theorem
13.13A for positive operators is proved in Problem 13.41.)


13.93. Consider the operator T defined by $T(u_i) = \sqrt{\lambda_i}\,u_i$, i = 1, ..., n, in the proof of Theorem 13.13A
(Problem 13.41). Show that T is positive and that it is the only positive operator for which $T^2 = P$.


13.94. Suppose P is both positive and unitary. Prove that P =I. 


13.95. An n × n (real or complex) matrix $A = (a_{ij})$ is said to be positive if A viewed as a linear operator
on $K^n$ is positive. (An analogous definition defines a positive definite matrix.) Prove A is positive
(positive definite) if and only if $a_{ij} = \bar a_{ji}$ and

$$\sum_{i,j=1}^{n} a_{ij}x_i\bar x_j \geq 0 \quad (> 0)$$

for every $(x_1, \ldots, x_n)$ in $K^n$.
13.96. Determine which of the following matrices are positive (positive definite): 
Lr Oped Oye Ba al Peel feo 
i gl & 0 =i 0 G 1) i s S i] 
(i) (ii) (iii) (iv) (v) (vi) 


13.97. Prove that a 2 × 2 complex matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is positive if and only if (i) A = A*, and
(ii) a, d and ad − bc are nonnegative real numbers.


13.98. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is 
a nonnegative (positive) real number. 


SELF-ADJOINT AND SYMMETRIC OPERATORS 
13.99. For any operator T, show that T + T* is self-adjoint and T— T* is skew-adjoint. 
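
The statement of Problem 13.99 (and the decomposition $T = \tfrac12(T + T^*) + \tfrac12(T - T^*)$ of Problem 13.42) can be checked on a sample matrix. In the Python sketch below the matrix T is an arbitrary choice and the function names are illustrative assumptions.

```python
def adjoint(m):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def combine(a, b, alpha, beta):
    """Return alpha*a + beta*b, computed entry by entry."""
    n = len(a)
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(n)]
            for i in range(n)]

T = [[1 + 2j, 3], [4j, 5 - 1j]]       # sample operator on C^2 (assumption)
T_star = adjoint(T)
S = combine(T, T_star, 0.5, 0.5)      # self-adjoint part, (T + T*)/2
U = combine(T, T_star, 0.5, -0.5)     # skew-adjoint part, (T - T*)/2
```

For this T one finds `adjoint(S) == S`, `adjoint(U)` equal to the negative of `U`, and `S + U` equal to `T`.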


13.100. Suppose T is self-adjoint. Show that $T^2(v) = 0$ implies T(v) = 0. Use this to prove that
$T^n(v) = 0$ also implies T(v) = 0 for n > 0.


13.101. Let V be a complex inner product space. Suppose (T(v), v) is real for every $v \in V$. Show that T
is self-adjoint.


13.102. Suppose S and T are self-adjoint. Show that ST is self-adjoint if and only if S and T commute,
i.e. ST = TS.


13.103. For each of the following symmetric matrices A, find an orthogonal matrix P for which PtAP is 
diagonal: 


: Pa yl if ey Ale i: bs on 
(ij A = eo ae (ii) = . cote (in) As = . 5 


13.104. Find an orthogonal transformation of coordinates which diagonalizes each quadratic form:
(i) $q(x, y) = 2x^2 - 6xy + 10y^2$, (ii) $q(x, y) = x^2 + 8xy - 5y^2$


13.105. Find an orthogonal transformation of coordinates which diagonalizes the quadratic form
$q(x, y, z) = 2xy + 2xz + 2yz$.


NORMAL OPERATORS AND MATRICES 
a 


>) 
13.106. Verify that A = ° 9 


) is normal. Find a unitary matrix P such that P*AP is diagonal, and- 
find P*AP. 


13.107. Show that a triangular matrix is normal if and only if it is diagonal. 


13.108. Prove that if T is normal on V, then ||T(v)|| = ||T*(v)|| for every v € V. Prove that the converse 
holds in complex inner product spaces. 


13.109. Show that self-adjoint, skew-adjoint and unitary (orthogonal) operators are normal. 


13.110. Suppose T is normal. Prove that:
(i) T is self-adjoint if and only if its eigenvalues are real.
(ii) T is unitary if and only if its eigenvalues have absolute value 1.
(iii) T is positive if and only if its eigenvalues are nonnegative real numbers.


13.111. Show that if T is normal, then T and T* have the same kernel and the same image.
13.112. Suppose S and T are normal and commute. Show that S + T and ST are also normal.
13.113. Suppose T is normal and commutes with S. Show that T also commutes with S*.


13.114. Prove: Let S and T be normal operators on a complex finite dimensional vector space V. Then 
there exists an orthonormal basis of V consisting of eigenvectors of both S and T. (That is, S and 
T can be simultaneously diagonalized.) 


ISOMORPHISM PROBLEMS 


13.115. Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space V over K. Show that the map
$v \mapsto [v]_e$ is an (inner product space) isomorphism between V and $K^n$. (Here $[v]_e$ denotes the
coordinate vector of v in the basis $\{e_i\}$.)


13.116. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the
same dimension.


13.117. Suppose $\{e_1, \ldots, e_n\}$ and $\{e_1', \ldots, e_n'\}$ are orthonormal bases of V and W respectively. Let $T: V \to W$
be the linear map defined by $T(e_i) = e_i'$ for each i. Show that T is an isomorphism.


13.118. Let V be an inner product space. Recall (page 283) that each $u \in V$ determines a linear functional
$\hat u$ in the dual space V* by the definition $\hat u(v) = (v, u)$ for every $v \in V$. Show that the map
$u \mapsto \hat u$ is linear and nonsingular, and hence an isomorphism from V onto V*.


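13.119. Consider the inner product space V of Problem 13.54. Show that V is isomorphic to $R^{mn}$ under the
mapping

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \mapsto (R_1, R_2, \ldots, R_m)$$

where $R_i = (a_{i1}, a_{i2}, \ldots, a_{in})$, the ith row of A.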


MISCELLANEOUS PROBLEMS 
13.120. Show that there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V consisting of eigenvectors of T if and
only if there exist orthogonal projections $E_1, \ldots, E_r$ and scalars $\lambda_1, \ldots, \lambda_r$ such that:
(i) $T = \lambda_1E_1 + \cdots + \lambda_rE_r$; (ii) $E_1 + \cdots + E_r = I$; (iii) $E_iE_j = 0$ for $i \neq j$.


13.121. Suppose $V = U \oplus W$ and suppose $T_1: U \to V$ and $T_2: W \to V$ are linear. Show that
$T = T_1 \oplus T_2$ is also linear. (Here T is defined as follows: if $v \in V$ and v = u + w where $u \in U$, $w \in W$,
then $T(v) = T_1(u) + T_2(w)$.)


13.122. Suppose U is an orthogonal operator on R? with positive determinant. Show that U is either a 
rotation or a reflection through a plane. 


Answers to Supplementary Problems 
13.47. (ii) k > 9; (iii) a > 0, d > 0, ad − bc > 0
13.48. (i) $\sqrt5$, (ii) $\sqrt{13}$
13.50. (i) $3\sqrt2$, (ii) $5\sqrt2$
13.56. (i) $\|u\| = \sqrt{65}/12$, (ii) $\|v\| = 2\sqrt{11}$, (iii) $\|f(t)\| = \sqrt{83/15}$, (iv) $\|A\| = \sqrt{30}$
13.60. (i) $\cos\theta = 9/\sqrt{420}$, (ii) $\cos\theta = \sqrt{15}/6$, (iii) $\cos\theta = 2/\sqrt{210}$
13.61. $\{v_1 = (1, 2, 1, 0),\; v_2 = (4, 4, 0, 1)\}$
13.62. $\{v_1 = (1, i, 1)/\sqrt3,\; v_2 = (2i,\, 1 - 3i,\, 3 - i)/\sqrt{24}\}$


13.63. (i) $\{f_1(t) = 7t^2 - 5t,\; f_2(t) = 12t^2 - 5\}$
(ii) $\{u_1(t) = 1,\; u_2(t) = \sqrt3\,(2t - 1),\; u_3(t) = \sqrt5\,(6t^2 - 6t + 1)\}$


sao of? 2).(¢ SL wf} 


13.67. (i) $(0, \tfrac12, \tfrac12)$, (ii) $(26 + 7i,\, 27 + 24i)/14$, (iii) $5t^2/6$, (iv)


0 V6 
—1/V6 —14//6 


13.73. T*(x, y, z) = (x + 3y, 2x + z, −4y)


13.74. $T^*(x, y, z) = (-ix + 3y,\; (2 - 3i)x + (2 + 5i)z,\; (3 + i)y - iz)$


13.75. Let $u = \overline{\phi(e_1)}\,e_1 + \cdots + \overline{\phi(e_n)}\,e_n$ where $\{e_i\}$ is an orthonormal basis.


(i) u = (1, 2, −3), (ii) $u = (-i,\, 2 - 3i,\, 1 + 2i)$, (iii) $u = 30t^2 - 24t + 3$


13.79. 


ie ONE : 1/3 ae ie 
(i Pee ec ania, Oe lve, tive 


2/V6 —1/V6 —1/V6 
13.80. $\begin{pmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{pmatrix}$


mie 
VV3 (1 —a/V8 
13.81. (i Pees a 
y aie 1/3 ) Hae ee ME a 


13.96. Only (i) and (v) are positive. Moreover, (v) is positive definite. 


2/5 —1/V5 2/V5 —1/V5 ths 3/V10 —1/V 10 
; : i 2 ; Sayre ; Bs 
van Cae 4) me ee 2/5 aD —1/V10 3/V/10 


13.104. (i) $x = (3x' - y')/\sqrt{10}$, $y = (x' + 3y')/\sqrt{10}$, (ii) $x = (2x' - y')/\sqrt5$, $y = (x' + 2y')/\sqrt5$


13.105. $x = x'/\sqrt3 + y'/\sqrt2 + z'/\sqrt6$, $\; y = x'/\sqrt3 - y'/\sqrt2 + z'/\sqrt6$, $\; z = x'/\sqrt3 - 2z'/\sqrt6$


13.106. P = 
ce 1/2 O Piet 


1/2 Sa Legh ene fie 0 




Appendix A 


Sets and Relations 


SETS, ELEMENTS 


Any well defined list or collection of objects is called a set; the objects comprising the 
set are called its elements or members. We write 


$p \in A$ if p is an element in the set A

If every element of A also belongs to a set B, i.e. if $x \in A$ implies $x \in B$, then A is called
a subset of B or is said to be contained in B; this is denoted by

$A \subset B$ or $B \supset A$


Two sets are equal if they both contain the same elements; that is,

A = B if and only if $A \subset B$ and $B \subset A$

The negations of $p \in A$, $A \subset B$ and A = B are written $p \notin A$, $A \not\subset B$ and $A \neq B$
respectively.


We specify a particular set by either listing its elements or by stating properties which 
characterize the elements in the set. For example, 


$A = \{1, 3, 5, 7, 9\}$
means A is the set consisting of the numbers 1, 3, 5, 7 and 9; and
$B = \{x : x \text{ is a prime number},\ x < 15\}$
means that B is the set of prime numbers less than 15. We also use special symbols to
denote sets which occur very often in the text. Unless otherwise specified:
N = the set of positive integers: 1, 2, 3, ...;
Z = the set of integers: ..., −2, −1, 0, 1, 2, ...;
Q = the set of rational numbers;
R = the set of real numbers;
C = the set of complex numbers.


We also use $\emptyset$ to denote the empty or null set, i.e. the set which contains no elements; this
set is assumed to be a subset of every other set.

Frequently the members of a set are sets themselves. For example, each line in a set 
of lines is a set of points. To help clarify these situations, we use the words class, collection 
and family synonymously with set. The words subclass, subcollection and subfamily have 
meanings analogous to subset. 

Example A.1: The sets A and B above can also be written as
$A = \{x \in N : x \text{ is odd},\ x < 10\}$ and $B = \{2, 3, 5, 7, 11, 13\}$
Observe that $9 \in A$ but $9 \notin B$, and $11 \in B$ but $11 \notin A$; whereas $3 \in A$ and
$3 \in B$, and $6 \notin A$ and $6 \notin B$.


Example A.2: The sets of numbers are related as follows: $N \subset Z \subset Q \subset R \subset C$.
Example A.3: Let $C = \{x : x^2 = 4,\ x \text{ is odd}\}$. Then $C = \emptyset$, that is, C is the empty set.


Example A.4: The members of the class {{2, 3}, {2}, {5, 6}} are the sets {2, 3}, {2} and {5, 6}.
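
The two ways of specifying a set, by listing its elements or by a characterizing property, correspond closely to set literals and set comprehensions in a programming language. The Python sketch below rebuilds the sets of Examples A.1 and A.3. Restricting the search for C to the integers from −100 to 100 is an assumption made so that the comprehension is finite.

```python
# A = {x in N : x is odd, x < 10}
A = {x for x in range(1, 10) if x % 2 == 1}

# B = {x : x is a prime number, x < 15}; x is prime when no d in 2..x-1 divides it
B = {x for x in range(2, 15) if all(x % d for d in range(2, x))}

# C = {x : x^2 = 4, x is odd}, searched over a finite range of integers (assumption)
C = {x for x in range(-100, 101) if x * x == 4 and x % 2 == 1}
```

The membership statements of Example A.1 then read `9 in A`, `9 not in B`, and so on, and `C` is the empty set.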




The following theorem applies. 


Theorem A.1: Let A, B and C be any sets. Then: (i) $A \subset A$; (ii) if $A \subset B$ and $B \subset A$,
then A = B; and (iii) if $A \subset B$ and $B \subset C$, then $A \subset C$.

We emphasize that $A \subset B$ does not exclude the possibility that A = B. However, if
$A \subset B$ but $A \neq B$, then we say that A is a proper subset of B. (Some authors use the
symbol $\subseteq$ for a subset and the symbol $\subset$ only for a proper subset.)

When we speak of an indexed set $\{a_i : i \in I\}$, or simply $\{a_i\}$, we mean that there is a
mapping $\phi$ from the set I to a set A and that the image $\phi(i)$ of $i \in I$ is denoted $a_i$. The set
I is called the indexing set and the elements $a_i$ (the range of $\phi$) are said to be indexed by I.
A set $\{a_1, a_2, \ldots\}$ indexed by the positive integers N is called a sequence. An indexed class
of sets $\{A_i : i \in I\}$, or simply $\{A_i\}$, has an analogous meaning except that now the map $\phi$
assigns to each $i \in I$ a set $A_i$ rather than an element $a_i$.


SET OPERATIONS 


Let A and B be arbitrary sets. The union of A and B, written $A \cup B$, is the set of
elements belonging to A or to B; and the intersection of A and B, written $A \cap B$, is the set
of elements belonging to both A and B:

$A \cup B = \{x : x \in A \text{ or } x \in B\}$ and $A \cap B = \{x : x \in A \text{ and } x \in B\}$

If $A \cap B = \emptyset$, that is, if A and B do not have any elements in common, then A and B are
said to be disjoint.

We assume that all our sets are subsets of a fixed universal set (denoted here by U).
Then the complement of A, written $A^c$, is the set of elements which do not belong to A:

$A^c = \{x : x \in U,\ x \notin A\}$


Example A.5: The following diagrams, called Venn diagrams, illustrate the above set operations.
Here sets are represented by simple plane areas and U, the universal set, by the
area in the entire rectangle.


[Figure: three Venn diagrams, captioned "$A \cup B$ is shaded", "$A \cap B$ is shaded" and "$A^c$ is shaded".]


Sets under the above operations satisfy various laws or identities which are listed in
the table below. In fact, we state


Theorem A.2: Sets satisfy the laws in Table 1. 


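LAWS OF THE ALGEBRA OF SETS


Idempotent Laws
1a. $A \cup A = A$                                   1b. $A \cap A = A$


Associative Laws
2a. $(A \cup B) \cup C = A \cup (B \cup C)$          2b. $(A \cap B) \cap C = A \cap (B \cap C)$


Commutative Laws
3a. $A \cup B = B \cup A$                            3b. $A \cap B = B \cap A$


Distributive Laws
4a. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ 4b. $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$


Identity Laws
5a. $A \cup \emptyset = A$                           5b. $A \cap U = A$
6a. $A \cup U = U$                                   6b. $A \cap \emptyset = \emptyset$


Complement Laws
7a. $A \cup A^c = U$                                 7b. $A \cap A^c = \emptyset$
8a. $(A^c)^c = A$                                    8b. $U^c = \emptyset$, $\emptyset^c = U$


De Morgan's Laws
9a. $(A \cup B)^c = A^c \cap B^c$                    9b. $(A \cap B)^c = A^c \cup B^c$


Table 1


Remark: Each of the above laws follows from an analogous logical law. For example,
$A \cap B = \{x : x \in A \text{ and } x \in B\} = \{x : x \in B \text{ and } x \in A\} = B \cap A$
(Here we use the fact that the composite statement "p and q", written $p \wedge q$, is
logically equivalent to the composite statement "q and p", i.e. $q \wedge p$.)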
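
Several of the laws in Table 1 can be spot-checked on small finite sets. The Python sketch below does this with the built-in `set` type; the universal set U and the sets A, B, C are arbitrary illustrative choices, and a check on one example is of course not a proof.

```python
U = set(range(10))          # a small universal set, chosen only for illustration
A = {1, 3, 5, 7, 9}
B = {2, 3, 5, 7}
C = {0, 1, 2}

def complement(s):
    """Complement relative to the universal set U."""
    return U - s

de_morgan_union = complement(A | B) == complement(A) & complement(B)   # law 9a
de_morgan_inter = complement(A & B) == complement(A) | complement(B)   # law 9b
distributive = (A | (B & C)) == ((A | B) & (A | C))                    # law 4a
involution = complement(complement(A)) == A                            # law 8a
```

All four flags evaluate to `True` for these sets.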


The relationship between set inclusion and the above set operations follows. 


Theorem A.3: Each of the following conditions is equivalent to $A \subset B$:
(i) $A \cap B = A$     (iii) $B^c \subset A^c$     (v) $B \cup A^c = U$
(ii) $A \cup B = B$     (iv) $A \cap B^c = \emptyset$

We generalize the above set operations as follows. Let $\{A_i : i \in I\}$ be any family of
sets. Then the union of the $A_i$, written $\cup_{i \in I} A_i$ (or simply $\cup_i A_i$), is the set of elements
each belonging to at least one of the $A_i$; and the intersection of the $A_i$, written $\cap_{i \in I} A_i$ or
simply $\cap_i A_i$, is the set of elements each belonging to every $A_i$.


PRODUCT SETS
Let A and B be two sets. The product set of A and B, denoted by $A \times B$, consists of all
ordered pairs (a, b) where $a \in A$ and $b \in B$:

$A \times B = \{(a, b) : a \in A,\ b \in B\}$


The product of a set with itself, say $A \times A$, is denoted by $A^2$.


Example A.6: The reader is familiar with the cartesian plane $R^2 = R \times R$ as shown below. Here
each point P represents an ordered pair (a, b) of real numbers, and vice versa.


Example A.7: Let A = {1, 2, 3} and B = {a, b}. Then
$A \times B = \{(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)\}$


Remark: The ordered pair (a, b) is defined rigorously by (a, b) = {{a}, {a, b}}. From this
definition, the "order" property may be proven; that is, (a, b) = (c, d) if and only
if a = c and b = d.


The concept of product set is extended to any finite number of sets in a natural way.
The product set of the sets $A_1, \ldots, A_m$, written $A_1 \times A_2 \times \cdots \times A_m$, is the set consisting of
all m-tuples $(a_1, a_2, \ldots, a_m)$ where $a_i \in A_i$ for each i.
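
Product sets of finite sets can be generated directly with `itertools.product` from the Python standard library. The sketch below rebuilds the product of Example A.7 and one triple product; representing the elements a, b of B by the strings "a", "b" is an assumption made for illustration.

```python
from itertools import product

A = [1, 2, 3]
B = ["a", "b"]

AxB = set(product(A, B))         # all ordered pairs (x, y) with x in A, y in B
AxBxA = set(product(A, B, A))    # all triples, i.e. the product A x B x A
```

`AxB` has 3 · 2 = 6 elements and contains `(2, "b")` but not `("b", 2)`, which reflects the order property of pairs.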


RELATIONS 


A binary relation or simply relation R from a set A to a set B assigns to each ordered
pair $(a, b) \in A \times B$ exactly one of the following statements:


(i) "a is related to b", written $a\,R\,b$,
(ii) "a is not related to b", written $a \not R\ b$.


A relation from a set A to the same set A is called a relation in A.


Example A.8: Set inclusion is a relation in any class of sets. For, given any pair of sets A and B,
either $A \subset B$ or $A \not\subset B$.


Observe that any relation R from A to B uniquely defines a subset $\hat R$ of $A \times B$ as follows:
$\hat R = \{(a, b) : a\,R\,b\}$
Conversely, any subset $\hat R$ of $A \times B$ defines a relation from A to B as follows:
$a\,R\,b$ if and only if $(a, b) \in \hat R$


In view of the above correspondence between relations from A to B and subsets of A x B, 
we redefine a relation as follows: 


Definition: A relation R from A to B is a subset of A X B. 


EQUIVALENCE RELATIONS 
A relation in a set A is called an equivalence relation if it satisfies the following axioms:
$[E_1]$ Every $a \in A$ is related to itself.
$[E_2]$ If a is related to b, then b is related to a.
$[E_3]$ If a is related to b and b is related to c, then a is related to c.
In general, a relation is said to be reflexive if it satisfies $[E_1]$, symmetric if it satisfies $[E_2]$,
and transitive if it satisfies $[E_3]$. In other words, a relation is an equivalence relation if
it is reflexive, symmetric and transitive.
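
For a relation on a finite set, stored as a set of ordered pairs, the three axioms can be tested by brute force. The Python sketch below is a minimal illustration; the two sample relations ("same parity" and "less than or equal") and the function names are assumptions chosen for the example.

```python
def is_reflexive(rel, universe):
    """[E1]: every element is related to itself."""
    return all((a, a) in rel for a in universe)

def is_symmetric(rel):
    """[E2]: a R b implies b R a."""
    return all((b, a) in rel for (a, b) in rel)

def is_transitive(rel):
    """[E3]: a R b and b R c imply a R c."""
    return all((a, d) in rel
               for (a, b) in rel for (c, d) in rel if b == c)

universe = range(6)
same_parity = {(a, b) for a in universe for b in universe if (a - b) % 2 == 0}
at_most = {(a, b) for a in universe for b in universe if a <= b}
```

`same_parity` passes all three tests and is therefore an equivalence relation, while `at_most` is reflexive and transitive but not symmetric.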


APPENDIX A] SETS AND RELATIONS 319 


Example A.9: Consider the relation C of set inclusion. By Theorem A.1, ACA for every set A; 
and if ACB and BCC, then ACC. That is, C is both reflexive and transitive. 
On the other hand, C is not symmetric, since ACB and A+B implies BCA. 


Example A.10: In Euclidean geometry, similarity of triangles is an equivalence relation. For if
$\alpha$, $\beta$ and $\gamma$ are any triangles, then: (i) $\alpha$ is similar to itself; (ii) if $\alpha$ is similar to $\beta$,
then $\beta$ is similar to $\alpha$; and (iii) if $\alpha$ is similar to $\beta$ and $\beta$ is similar to $\gamma$, then $\alpha$ is
similar to $\gamma$.


If R is an equivalence relation in A, then the equivalence class of any element $a \in A$,
denoted by [a], is the set of elements to which a is related:


$[a] = \{x : a \, R \, x\}$
The collection of equivalence classes, denoted by A/R, is called the quotient of A by R:
$A/R = \{[a] : a \in A\}$
The fundamental property of equivalence relations follows: 


Theorem A.4: Let R be an equivalence relation in A. Then the quotient set A/R is a
partition of A, i.e. each $a \in A$ belongs to a member of A/R, and the members
of A/R are pairwise disjoint.


Example A.11: Let $R_5$ be the relation in Z, the set of integers, defined by
$x \equiv y \pmod 5$


which reads “x is congruent to y modulo 5” and which means “x - y is divisible by 5”.
Then $R_5$ is an equivalence relation in Z. There are exactly five distinct equivalence
classes in $Z/R_5$:


$A_0 = \{\ldots, -10, -5, 0, 5, 10, \ldots\}$
$A_1 = \{\ldots, -9, -4, 1, 6, 11, \ldots\}$
$A_2 = \{\ldots, -8, -3, 2, 7, 12, \ldots\}$
$A_3 = \{\ldots, -7, -2, 3, 8, 13, \ldots\}$
$A_4 = \{\ldots, -6, -1, 4, 9, 14, \ldots\}$


Now each integer x is uniquely expressible in the form $x = 5q + r$ where $0 \le r < 5$;
observe that $x \in A_r$ where r is the remainder. Note that the equivalence classes
are pairwise disjoint and that $Z = A_0 \cup A_1 \cup A_2 \cup A_3 \cup A_4$.
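
The partition described in Example A.11 can be checked on a finite window of integers. The following is only a sketch: the helper names `equivalence_classes` and `congruent_mod_5` and the window `range(-10, 15)` are illustrative choices, not part of the text.

```python
def equivalence_classes(elements, related):
    """Group `elements` into the classes of the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            # By symmetry and transitivity, one representative per class suffices.
            if related(cls[0], x):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


def congruent_mod_5(x, y):
    """x is related to y when x - y is divisible by 5."""
    return (x - y) % 5 == 0


sample = list(range(-10, 15))  # finite window of Z, chosen for illustration
classes = equivalence_classes(sample, congruent_mod_5)

for cls in classes:
    print(cls)
```

On this window the function returns five classes, one for each remainder 0, 1, 2, 3, 4, and every sampled integer lands in exactly one of them.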


Appendix B 


Algebraic Structures 


INTRODUCTION 


We define here algebraic structures which occur in almost all branches of mathematics. 
In particular we will define a field which appears in the definition of a vector space. We 
begin with the definition of a group, which is a relatively simple algebraic structure with 
only one operation and is used as a building block for many other algebraic systems. 


GROUPS 


Let G be a nonempty set with a binary operation, i.e. to each pair of elements $a, b \in G$
there is assigned an element $ab \in G$. Then G is called a group if the following axioms hold:
[G1] For any $a, b, c \in G$, we have $(ab)c = a(bc)$ (the associative law).

[G2] There exists an element $e \in G$, called the identity element, such that $ae = ea = a$ for
every $a \in G$.

[G3] For each $a \in G$ there exists an element $a^{-1} \in G$, called the inverse of a, such that
$aa^{-1} = a^{-1}a = e$.

A group G is said to be abelian (or: commutative) if the commutative law holds, i.e. if
$ab = ba$ for every $a, b \in G$.

When the binary operation is denoted by juxtaposition as above, the group G is said
to be written multiplicatively. Sometimes, when G is abelian, the binary operation is denoted
by + and G is said to be written additively. In such case the identity element is
denoted by 0 and is called the zero element; and the inverse is denoted by $-a$ and is called
the negative of a.


If A and B are subsets of a group G then we write
$AB = \{ab : a \in A,\ b \in B\}$, or $A + B = \{a + b : a \in A,\ b \in B\}$
We also write a for {a}.


A subset H of a group G is called a subgroup of G if H itself forms a group under the
operation of G. If H is a subgroup of G and $a \in G$, then the set Ha is called a right coset
of H and the set aH is called a left coset of H.


Definition: A subgroup H of G is called a normal subgroup if $a^{-1}Ha \subset H$ for every $a \in G$.
Equivalently, H is normal if $aH = Ha$ for every $a \in G$, i.e. if the right and
left cosets of H coincide.


Note that every subgroup of an abelian group is normal. 


Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group 
under coset multiplication. This group is called the quotient group and is 
denoted by G/H. 


Example B.1: The set Z of integers forms an abelian group under addition. (We remark that the
even integers form a subgroup of Z but the odd integers do not.) Let H denote the
set of multiples of 5, i.e. $H = \{\ldots, -10, -5, 0, 5, 10, \ldots\}$. Then H is a subgroup
(necessarily normal) of Z. The cosets of H in Z follow:

$\bar{0} = 0 + H = H = \{\ldots, -10, -5, 0, 5, 10, \ldots\}$
$\bar{1} = 1 + H = \{\ldots, -9, -4, 1, 6, 11, \ldots\}$
$\bar{2} = 2 + H = \{\ldots, -8, -3, 2, 7, 12, \ldots\}$
$\bar{3} = 3 + H = \{\ldots, -7, -2, 3, 8, 13, \ldots\}$
$\bar{4} = 4 + H = \{\ldots, -6, -1, 4, 9, 14, \ldots\}$

For any other integer $n \in Z$, $\bar{n} = n + H$ coincides with one of the above cosets.
By Theorem B.1, $Z/H = \{\bar{0}, \bar{1}, \bar{2}, \bar{3}, \bar{4}\}$ is a group under coset addition;
its addition table follows:

$$
\begin{array}{c|ccccc}
+ & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4} \\ \hline
\bar{0} & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4} \\
\bar{1} & \bar{1} & \bar{2} & \bar{3} & \bar{4} & \bar{0} \\
\bar{2} & \bar{2} & \bar{3} & \bar{4} & \bar{0} & \bar{1} \\
\bar{3} & \bar{3} & \bar{4} & \bar{0} & \bar{1} & \bar{2} \\
\bar{4} & \bar{4} & \bar{0} & \bar{1} & \bar{2} & \bar{3}
\end{array}
$$

This quotient group Z/H is referred to as the integers modulo 5 and is frequently
denoted by $Z_5$. Analogously, for any positive integer n, there exists the quotient
group $Z_n$ called the integers modulo n.


Example B.2: The permutations of n symbols (see page 171) form a group under composition of
mappings; it is called the symmetric group of degree n and is denoted by $S_n$. We
investigate $S_3$ here; its elements are


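$\epsilon = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$

$\sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \phi_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \phi_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$

Here $\begin{pmatrix} 1 & 2 & 3 \\ i & j & k \end{pmatrix}$ is the permutation which maps $1 \mapsto i$, $2 \mapsto j$, $3 \mapsto k$. The multiplication
table of $S_3$ is

$$
\begin{array}{c|cccccc}
 & \epsilon & \sigma_1 & \sigma_2 & \sigma_3 & \phi_1 & \phi_2 \\ \hline
\epsilon & \epsilon & \sigma_1 & \sigma_2 & \sigma_3 & \phi_1 & \phi_2 \\
\sigma_1 & \sigma_1 & \epsilon & \phi_1 & \phi_2 & \sigma_2 & \sigma_3 \\
\sigma_2 & \sigma_2 & \phi_2 & \epsilon & \phi_1 & \sigma_3 & \sigma_1 \\
\sigma_3 & \sigma_3 & \phi_1 & \phi_2 & \epsilon & \sigma_1 & \sigma_2 \\
\phi_1 & \phi_1 & \sigma_3 & \sigma_1 & \sigma_2 & \phi_2 & \epsilon \\
\phi_2 & \phi_2 & \sigma_2 & \sigma_3 & \sigma_1 & \epsilon & \phi_1
\end{array}
$$

(The element in the ath row and bth column is ab.) The set $H = \{\epsilon, \sigma_1\}$ is a subgroup
of $S_3$; its right and left cosets are


Right Cosets                              Left Cosets
$H = \{\epsilon, \sigma_1\}$              $H = \{\epsilon, \sigma_1\}$
$H\phi_1 = \{\phi_1, \sigma_2\}$          $\phi_1 H = \{\phi_1, \sigma_3\}$
$H\phi_2 = \{\phi_2, \sigma_3\}$          $\phi_2 H = \{\phi_2, \sigma_2\}$


Observe that the right cosets and the left cosets are distinct; hence H is not a normal
subgroup of $S_3$.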
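
The coset computation in Example B.2 can be reproduced mechanically. The sketch below rests on two assumptions that are not stated in this passage: the labels `s1`, `s2`, `s3`, `f1`, `f2` stand for one particular reading of the example's six permutations (given as tuples of images), and the product ab is taken to mean "apply b first, then a".

```python
# Each permutation of {1, 2, 3} is stored as the tuple of images (p(1), p(2), p(3)).
# The labels are an assumed reading of the example's notation.
S3 = {
    "e":  (1, 2, 3),
    "s1": (1, 3, 2),
    "s2": (3, 2, 1),
    "s3": (2, 1, 3),
    "f1": (2, 3, 1),
    "f2": (3, 1, 2),
}
NAME = {perm: name for name, perm in S3.items()}


def compose(a, b):
    """Return the product ab, assumed to mean 'apply b first, then a'."""
    return tuple(a[b[i] - 1] for i in range(3))


def right_coset(H, g):
    """Hg = {hg : h in H}, returned as a set of labels."""
    return frozenset(NAME[compose(S3[h], S3[g])] for h in H)


def left_coset(H, g):
    """gH = {gh : h in H}, returned as a set of labels."""
    return frozenset(NAME[compose(S3[g], S3[h])] for h in H)


H = ["e", "s1"]
right = {right_coset(H, g) for g in S3}
left = {left_coset(H, g) for g in S3}

print(sorted(sorted(c) for c in right))
print(sorted(sorted(c) for c in left))
print("H is normal:", right == left)
```

Under these assumptions the two families of cosets differ, which is the criterion for H failing to be normal.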


A mapping f from a group G into a group G′ is called a homomorphism if $f(ab) = f(a)f(b)$
for every $a, b \in G$. (If f is also bijective, i.e. one-to-one and onto, then f is called
an isomorphism and G and G′ are said to be isomorphic.) If $f : G \to G'$ is a homomorphism,
then the kernel of f is the set of elements of G which map into the identity element $e' \in G'$:


kernel of f = $\{a \in G : f(a) = e'\}$


(As usual, f(G) is called the image of the mapping $f : G \to G'$.) The following theorem applies.


Theorem B.2: Let $f : G \to G'$ be a homomorphism with kernel K. Then K is a normal
subgroup of G, and the quotient group G/K is isomorphic to the image of f.




Example B.3: Let G be the group of real numbers under addition, and let G′ be the group of
positive real numbers under multiplication. The mapping $f : G \to G'$ defined by
$f(a) = 2^a$ is a homomorphism because


$f(a + b) = 2^{a+b} = 2^a 2^b = f(a) f(b)$
In particular, f is bijective; hence G and G′ are isomorphic.

Example B.4: Let G be the group of nonzero complex numbers under multiplication, and let G′
be the group of nonzero real numbers under multiplication. The mapping $f : G \to G'$
defined by $f(z) = |z|$ is a homomorphism because

$f(z_1 z_2) = |z_1 z_2| = |z_1| \, |z_2| = f(z_1) f(z_2)$
The kernel K of f consists of those complex numbers z on the unit circle, i.e. for
which $|z| = 1$. Thus G/K is isomorphic to the image of f, i.e. to the group of positive
real numbers under multiplication.
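
The homomorphism property in Examples B.3 and B.4 can be spot-checked numerically. This is only a sketch: the function names `f_exp` and `f_abs` and the sample values are illustrative, and the comparisons use `math.isclose` because floating-point arithmetic is inexact.

```python
import cmath
import math


def f_exp(a):
    """Example B.3: reals under addition -> positive reals under multiplication."""
    return 2.0 ** a


def f_abs(z):
    """Example B.4: nonzero complex numbers -> nonzero reals, both under multiplication."""
    return abs(z)


# f(a + b) = f(a) f(b), up to floating-point rounding
a, b = 1.5, -0.25
print(math.isclose(f_exp(a + b), f_exp(a) * f_exp(b)))

# f(z1 z2) = f(z1) f(z2), up to floating-point rounding
z1, z2 = complex(3, 4), complex(-1, 2)
print(math.isclose(f_abs(z1 * z2), f_abs(z1) * f_abs(z2)))

# A point on the unit circle lies in the kernel of f_abs
w = cmath.exp(1j * 0.7)
print(math.isclose(f_abs(w), 1.0))
```

A numerical check of this kind illustrates the identities for particular values; it is not a substitute for the algebraic argument given in the examples.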


RINGS, INTEGRAL DOMAINS AND FIELDS 


Let R be a nonempty set with two binary operations, an operation of addition (denoted
by +) and an operation of multiplication (denoted by juxtaposition). Then R is called a
ring if the following axioms are satisfied:

[R1] For any $a, b, c \in R$, we have $(a + b) + c = a + (b + c)$.

[R2] There exists an element $0 \in R$, called the zero element, such that $a + 0 = 0 + a = a$
for every $a \in R$.

[R3] For each $a \in R$ there exists an element $-a \in R$, called the negative of a, such that
$a + (-a) = (-a) + a = 0$.

[R4] For any $a, b \in R$, we have $a + b = b + a$.

[R5] For any $a, b, c \in R$, we have $(ab)c = a(bc)$.

[R6] For any $a, b, c \in R$, we have:
(i) $a(b + c) = ab + ac$, and (ii) $(b + c)a = ba + ca$.

Observe that the axioms [R1] through [R4] may be summarized by saying that R is an
abelian group under addition.

Subtraction is defined in R by $a - b = a + (-b)$.

It can be shown (see Problem B.25) that $a \cdot 0 = 0 \cdot a = 0$ for every $a \in R$.

R is called a commutative ring if $ab = ba$ for every $a, b \in R$. We also say that R is
a ring with a unit element if there exists a nonzero element $1 \in R$ such that $a \cdot 1 = 1 \cdot a = a$
for every $a \in R$.

A nonempty subset S of R is called a subring of R if S itself forms a ring under the
operations of R. We note that S is a subring of R if and only if $a, b \in S$ implies $a - b \in S$
and $ab \in S$.

A nonempty subset I of R is called a left ideal in R if: (i) $a - b \in I$ whenever $a, b \in I$,
and (ii) $ra \in I$ whenever $r \in R$, $a \in I$. Note that a left ideal I in R is also a subring of R.
Similarly we can define a right ideal and a two-sided ideal. Clearly all ideals in commutative
rings are two-sided. The term ideal shall mean two-sided ideal unless otherwise
specified.


Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets $\{a + I : a \in R\}$
form a ring under coset addition and coset multiplication. This ring is
denoted by R/I and is called the quotient ring.

Now let R be a commutative ring with a unit element. For any $a \in R$, the set
$(a) = \{ra : r \in R\}$ is an ideal; it is called the principal ideal generated by a. If every ideal
in R is a principal ideal, then R is called a principal ideal ring.


Definition: A commutative ring R with a unit element is called an integral domain if R
has no zero divisors, i.e. if $ab = 0$ implies $a = 0$ or $b = 0$.




Definition: A commutative ring R with a unit element is called a field if every nonzero
$a \in R$ has a multiplicative inverse, i.e. there exists an element $a^{-1} \in R$ such
that $aa^{-1} = a^{-1}a = 1$.


A field is necessarily an integral domain; for if $ab = 0$ and $a \ne 0$, then
$b = 1 \cdot b = a^{-1}ab = a^{-1} \cdot 0 = 0$


We remark that a field may also be viewed as a commutative ring in which the nonzero 
elements form a group under multiplication. 


Example B.5: The set Z of integers with the usual operations of addition and multiplication is the
classical example of an integral domain with a unit element. Every ideal I in Z is
a principal ideal, i.e. $I = (n)$ for some integer n. The quotient ring $Z_n = Z/(n)$
is called the ring of integers modulo n. If n is prime, then $Z_n$ is a field. On the
other hand, if n is not prime then $Z_n$ has zero divisors. For example, in the ring $Z_6$,
$\bar{2} \cdot \bar{3} = \bar{0}$ and $\bar{2} \ne \bar{0}$ and $\bar{3} \ne \bar{0}$.
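
The contrast between prime and composite moduli in Example B.5 can be seen by brute force for small n. The sketch below represents the elements of $Z_n$ by the remainders 0, 1, ..., n - 1; the function names are illustrative only.

```python
def zero_divisors(n):
    """Nonzero a in Z_n for which ab = 0 (mod n) for some nonzero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


def inverses(n):
    """Map each nonzero a in Z_n to its multiplicative inverse, where one exists."""
    table = {}
    for a in range(1, n):
        for b in range(1, n):
            if (a * b) % n == 1:
                table[a] = b
                break
    return table


def is_field(n):
    """Z_n is a field exactly when every nonzero element has an inverse."""
    return n > 1 and len(inverses(n)) == n - 1


print(zero_divisors(6))          # [2, 3, 4]
print(inverses(5))               # {1: 1, 2: 3, 3: 2, 4: 4}
print(is_field(5), is_field(6))  # True False
```

The exhaustive search is quadratic in n and is meant only to illustrate the definitions for small moduli.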


Example B.6: The rational numbers Q and the real numbers R each form a field with respect 
to the usual operations of addition and multiplication. 


Example B.7: Let C denote the set of ordered pairs of real numbers with addition and multiplication
defined by
$(a, b) + (c, d) = (a + c, b + d)$
$(a, b) \cdot (c, d) = (ac - bd, ad + bc)$


Then C satisfies all the required properties of a field. In fact, C is just the field of
complex numbers (see page 4).


Example B.8: The set M of all 2 by 2 matrices with real entries forms a noncommutative ring with
zero divisors under the operations of matrix addition and matrix multiplication.


Example B.9: Let R be any ring. Then the set R[x] of all polynomials over R forms a ring with
respect to the usual operations of addition and multiplication of polynomials.
Moreover, if R is an integral domain then R[x] is also an integral domain.


Now let D be an integral domain. We say that b divides a in D if $a = bc$ for some
$c \in D$. An element $u \in D$ is called a unit if u divides 1, i.e. if u has a multiplicative inverse.
An element $b \in D$ is called an associate of $a \in D$ if $b = ua$ for some unit $u \in D$. A
nonunit $p \in D$ is said to be irreducible if $p = ab$ implies a or b is a unit.


An integral domain D is called a unique factorization domain if every nonunit $a \in D$
can be written uniquely (up to associates and order) as a product of irreducible elements.


Example B.10: The ring Z of integers is the classical example of a unique factorization domain.
The units of Z are 1 and $-1$. The only associates of $n \in Z$ are n and $-n$. The
irreducible elements of Z are the prime numbers.


Example B.11: The set $D = \{a + b\sqrt{13} : a, b \text{ integers}\}$ is an integral domain. The units of D
include $\pm 1$, $18 \pm 5\sqrt{13}$ and $-18 \pm 5\sqrt{13}$. The elements 2, $3 - \sqrt{13}$ and $-3 - \sqrt{13}$ are
irreducible in D. Observe that $4 = 2 \cdot 2 = (3 - \sqrt{13})(-3 - \sqrt{13})$. Thus D is not
a unique factorization domain. (See Problem B.40.)
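
The arithmetic behind Example B.11 can be verified with exact integers. In this sketch an element $a + b\sqrt{13}$ is stored as the pair `(a, b)`; the helper names `mul` and `norm` are illustrative, and `norm` follows the definition $N(a + b\sqrt{13}) = a^2 - 13b^2$ given in Problem B.40.

```python
def mul(x, y):
    """Multiply a + b*sqrt(13) and c + d*sqrt(13), each stored as an integer pair."""
    (a, b), (c, d) = x, y
    return (a * c + 13 * b * d, a * d + b * c)


def norm(x):
    """N(a + b*sqrt(13)) = a^2 - 13 b^2, as in Problem B.40."""
    a, b = x
    return a * a - 13 * b * b


two = (2, 0)
p = (3, -1)    # 3 - sqrt(13)
q = (-3, -1)   # -3 - sqrt(13)
u = (18, 5)    # 18 + 5 sqrt(13)

print(mul(two, two), mul(p, q))  # both products equal (4, 0)
print(norm(u))                   # -1
print(mul(u, (-18, 5)))          # (1, 0), so -18 + 5 sqrt(13) inverts u
```

The check confirms the two factorizations of 4 and that $18 + 5\sqrt{13}$ is a unit; irreducibility of the factors still requires the norm argument of Problem B.40.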


MODULES 

Let M be a nonempty set and let R be a ring with a unit element. Then M is said to be a
(left) R-module if M is an additive abelian group and there exists a mapping $R \times M \to M$
which satisfies the following axioms:

[M1] $r(m_1 + m_2) = rm_1 + rm_2$
[M2] $(r + s)m = rm + sm$
[M3] $(rs)m = r(sm)$
[M4] $1 \cdot m = m$

for any $r, s \in R$ and any $m_i \in M$.


We emphasize that an R-module is a generalization of a vector space where we allow 
the scalars to come from a ring rather than a field. 


Example B.12: Let G be any additive abelian group. We make G into a module over the ring Z of
integers by defining
$ng = \underbrace{g + g + \cdots + g}_{n \text{ times}}, \quad 0g = 0, \quad (-n)g = -ng$
where n is any positive integer.


Example B.13: Let R be a ring and let I be an ideal in R. Then I may be viewed as a module over R.


Example B.14: Let V be a vector space over a field K and let $T : V \to V$ be a linear mapping.
We make V into a module over the ring K[x] of polynomials over K by defining
$f(x)v = f(T)(v)$. The reader should check that a scalar multiplication has been
defined.


Let M be a module over R. An additive subgroup N of M is called a submodule of M
if $u \in N$ and $k \in R$ imply $ku \in N$. (Note that N is then a module over R.)


Let M and M′ be R-modules. A mapping $T : M \to M'$ is called a homomorphism (or:
R-homomorphism or R-linear) if
(i) $T(u + v) = T(u) + T(v)$ and (ii) $T(ku) = kT(u)$
for every $u, v \in M$ and every $k \in R$.


Problems 


GROUPS 
B.1. Determine whether each of the following systems forms a group G: 


(i) G = set of integers, operation subtraction; 

(ii) G = {1,—1}, operation multiplication; 

(iii) G = set of nonzero rational numbers, operation division; 

(iv) G = set of nonsingular $n \times n$ matrices, operation matrix multiplication;

(v) $G = \{a + bi : a, b \in Z\}$, operation addition.


B.2. Show that in a group G:
(i) the identity element of G is unique;
(ii) each $a \in G$ has a unique inverse $a^{-1} \in G$;
(iii) $(a^{-1})^{-1} = a$, and $(ab)^{-1} = b^{-1}a^{-1}$;
(iv) $ab = ac$ implies $b = c$, and $ba = ca$ implies $b = c$.


B.3. In a group G, the powers of $a \in G$ are defined by

$a^0 = e, \quad a^n = a\,a^{n-1}, \quad a^{-n} = (a^n)^{-1}, \quad \text{where } n \in N$

Show that the following formulas hold for any integers $r, s, t \in Z$: (i) $a^r a^s = a^{r+s}$, (ii) $(a^r)^s = a^{rs}$,
(iii) $(a^{r+s})^t = a^{rt+st}$.




B.4. Show that if G is an abelian group, then $(ab)^n = a^n b^n$ for any $a, b \in G$ and any integer $n \in Z$.

B.5. Suppose G is a group such that $(ab)^2 = a^2 b^2$ for every $a, b \in G$. Show that G is abelian.


B.6. Suppose H is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is nonempty,
and (ii) $a, b \in H$ implies $ab^{-1} \in H$.


B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G. 


B.8. Show that the set of all powers of $a \in G$ is a subgroup of G; it is called the cyclic group generated
by a.


B.9. A group G is said to be cyclic if G is generated by some $a \in G$, i.e. $G = \{a^n : n \in Z\}$. Show that
every subgroup of a cyclic group is cyclic.


B.10. Suppose G is a cyclic group. Show that G is isomorphic to the set Z of integers under addition
or to the set $Z_n$ (of the integers modulo n) under addition.


B.11. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint
subsets.


B.12. The order of a group G, denoted by |G|, is the number of elements of G. Prove Lagrange’s theorem: 
If H is a subgroup of a finite group G, then |H| divides |G|. 


B.13. Suppose |G| = p where p is prime. Show that G is cyclic. 


B.14. Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and
(ii) $H \cap N$ is a normal subgroup of H.


B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G. 


B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group 
G/H under coset multiplication. 


B.17. Suppose G is an abelian group. Show that any factor group G/H is also abelian. 


B.18. Let $f : G \to G'$ be a group homomorphism. Show that:
(i) $f(e) = e'$ where e and e′ are the identity elements of G and G′ respectively;
(ii) $f(a^{-1}) = f(a)^{-1}$ for any $a \in G$.


B.19. Prove Theorem B.2: Let $f : G \to G'$ be a group homomorphism with kernel K. Then K is a normal
subgroup of G, and the quotient group G/K is isomorphic to the image of f.


B.20. Let G be the multiplicative group of complex numbers z such that |z| =1, and let R be the additive 
group of real numbers. Prove that G is isomorphic to R/Z. 


B.21. For a fixed $g \in G$, let $\hat{g} : G \to G$ be defined by $\hat{g}(a) = g^{-1}ag$. Show that $\hat{g}$ is an isomorphism of
G onto G.


B.22. Let G be the multiplicative group of $n \times n$ nonsingular matrices over R. Show that the mapping
$A \mapsto |A|$ is a homomorphism of G into the multiplicative group of nonzero real numbers.


B.23. Let G be an abelian group. For a fixed $n \in Z$, show that the map $a \mapsto a^n$ is a homomorphism
of G into G.


B.24. Suppose H and N are subgroups of G with N normal. Prove that $H \cap N$ is normal in H and
$H/(H \cap N)$ is isomorphic to HN/N.


RINGS 


B.25. Show that in a ring R:
(i) $a \cdot 0 = 0 \cdot a = 0$, (ii) $a(-b) = (-a)b = -ab$, (iii) $(-a)(-b) = ab$.


B.26. Show that in a ring R with a unit element: (i) $(-1)a = -a$, (ii) $(-1)(-1) = 1$.


B.27. Suppose $a^2 = a$ for every $a \in R$. Prove that R is a commutative ring. (Such a ring is called a
Boolean ring.)


B.28. Let R be a ring with a unit element. We make R into another ring $\hat{R}$ by defining $a \oplus b = a + b + 1$
and $a * b = ab + a + b$. (i) Verify that $\hat{R}$ is a ring. (ii) Determine the 0-element and 1-element
of $\hat{R}$.


B.29. Let G be any (additive) abelian group. Define a multiplication in G by $a \cdot b = 0$. Show that this
makes G into a ring.


B.30. Prove Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets $\{a + I : a \in R\}$ form
a ring under coset addition and coset multiplication.


B.31. Let $I_1$ and $I_2$ be ideals in R. Prove that $I_1 + I_2$ and $I_1 \cap I_2$ are also ideals in R.


B.32. Let R and R′ be rings. A mapping $f : R \to R'$ is called a homomorphism (or: ring homomorphism) if
(i) $f(a + b) = f(a) + f(b)$ and (ii) $f(ab) = f(a) f(b)$,
for every $a, b \in R$. Prove that if $f : R \to R'$ is a homomorphism, then the set $K = \{r \in R : f(r) = 0\}$
is an ideal in R. (The set K is called the kernel of f.)


INTEGRAL DOMAINS AND FIELDS 


B.33. Prove that in an integral domain D, if $ab = ac$, $a \ne 0$, then $b = c$.

B.34. Prove that $F = \{a + b\sqrt{2} : a, b \text{ rational}\}$ is a field.

B.35. Prove that $D = \{a + b\sqrt{2} : a, b \text{ integers}\}$ is an integral domain but not a field.

B.36. Prove that a finite integral domain D is a field.

B.37. Show that the only ideals in a field K are {0} and K.

B.38. A complex number $a + bi$ where a, b are integers is called a Gaussian integer. Show that the set G
of Gaussian integers is an integral domain. Also show that the units in G are $\pm 1$ and $\pm i$.

B.39. Let D be an integral domain and let I be an ideal in D. Prove that the factor ring D/I is an integral
domain if and only if I is a prime ideal. (An ideal I is prime if $ab \in I$ implies $a \in I$ or $b \in I$.)

B.40. Consider the integral domain $D = \{a + b\sqrt{13} : a, b \text{ integers}\}$ (see Example B.11). If $\alpha = a + b\sqrt{13}$,
we define $N(\alpha) = a^2 - 13b^2$. Prove: (i) $N(\alpha\beta) = N(\alpha)N(\beta)$; (ii) $\alpha$ is a unit if and only if $N(\alpha) = \pm 1$;
(iii) $\pm 1$, $18 \pm 5\sqrt{13}$ and $-18 \pm 5\sqrt{13}$ are units of D; (iv) the numbers 2, $3 - \sqrt{13}$ and $-3 - \sqrt{13}$
are irreducible.

MODULES 

B.41. Let M be an R-module and let A and B be submodules of M. Show that $A + B$ and $A \cap B$ are also
submodules of M.

B.42. Let M be an R-module with submodule N. Show that the cosets $\{u + N : u \in M\}$ form an R-module
under coset addition and scalar multiplication defined by $r(u + N) = ru + N$. (This module is denoted
by M/N and is called the quotient module.)

B.43. Let M and M′ be R-modules and let $f : M \to M'$ be an R-homomorphism. Show that the set
$K = \{u \in M : f(u) = 0\}$ is a submodule of M. (The set K is called the kernel of f.)

B.44. Let M be an R-module and let E(M) denote the set of all R-homomorphisms of M into itself. Define
the appropriate operations of addition and multiplication in E(M) so that E(M) becomes a ring.


Appendix C 


Polynomials over a Field 


INTRODUCTION 


We will investigate polynomials over a field K and show that they have many properties 
which are analogous to properties of the integers. These results play an important role 
in obtaining canonical forms for a linear operator T on a vector space V over K. 


RING OF POLYNOMIALS 


Let K be a field. Formally, a polynomial f over K is an infinite sequence of elements
from K in which all except a finite number of them are 0:


$f = (\ldots, 0, a_n, \ldots, a_1, a_0)$
(We write the sequence so that it extends to the left instead of to the right.) The entry $a_k$
is called the kth coefficient of f. If n is the largest integer for which $a_n \ne 0$, then we say
that the degree of f is n, written
$\deg f = n$


We also call $a_n$ the leading coefficient of f, and if $a_n = 1$ we call f a monic polynomial. On
the other hand, if every coefficient of f is 0 then f is called the zero polynomial, written
$f = 0$. The degree of the zero polynomial is not defined.


Now if g is another polynomial over K, say
$g = (\ldots, 0, b_m, \ldots, b_1, b_0)$


then the sum $f + g$ is the polynomial obtained by adding corresponding coefficients. That
is, if $m \le n$ then
$f + g = (\ldots, 0, a_n, \ldots, a_m + b_m, \ldots, a_1 + b_1, a_0 + b_0)$
Furthermore, the product fg is the polynomial
$fg = (\ldots, 0, a_n b_m, \ldots, a_1 b_0 + a_0 b_1, a_0 b_0)$


that is, the kth coefficient $c_k$ of fg is
$c_k = \sum_{i=0}^{k} a_i b_{k-i} = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0$
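
The sum and product rules above translate directly into operations on coefficient lists. The following sketch assumes a storage convention that differs from the text's leftward-extending sequences: a polynomial is the list `[a_0, a_1, ..., a_n]` with the constant term first, and the zero polynomial is the empty list. The names `trim`, `poly_add` and `poly_mul` are illustrative.

```python
def trim(coeffs):
    """Drop trailing zero coefficients so the last entry is the leading one."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_add(f, g):
    """Add corresponding coefficients, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def poly_mul(f, g):
    """The kth coefficient of fg is the sum of a_i * b_(k-i) over i = 0..k."""
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b   # a_i b_j contributes to c_(i+j)
    return trim(c)


f = [1, 2]        # 1 + 2t
g = [-1, 0, 3]    # -1 + 3t^2
print(poly_add(f, g))  # [0, 2, 3]
print(poly_mul(f, g))  # [-1, -2, 3, 6]
```

In the product, the result has length `len(f) + len(g) - 1`, so the degrees of nonzero factors add when the coefficients come from a field.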


The following theorem applies. 


Theorem C.1: The set P of polynomials over a field K under the above operations of addition
and multiplication forms a commutative ring with a unit element
and with no zero divisors, i.e. an integral domain. If f and g are nonzero
polynomials in P, then $\deg(fg) = \deg f + \deg g$.




NOTATION 
We identify the scalar $a_0 \in K$ with the polynomial


$a_0 = (\ldots, 0, a_0)$
We also choose a symbol, say t, to denote the polynomial
$t = (\ldots, 0, 1, 0)$
We call the symbol t an indeterminate. Multiplying t with itself, we obtain
$t^2 = (\ldots, 0, 1, 0, 0), \quad t^3 = (\ldots, 0, 1, 0, 0, 0), \quad \ldots$
Thus the above polynomial f can be written uniquely in the usual form
$f = a_n t^n + \cdots + a_1 t + a_0$


When the symbol t is selected as the indeterminate, the ring of polynomials over K is
denoted by
K[t]


and a polynomial f is frequently denoted by f(t).

We also view the field K as a subset of K[t] under the above identification. This is possible
since the operations of addition and multiplication of elements of K are preserved
under this identification:

$(\ldots, 0, a_0) + (\ldots, 0, b_0) = (\ldots, 0, a_0 + b_0)$
$(\ldots, 0, a_0) \cdot (\ldots, 0, b_0) = (\ldots, 0, a_0 b_0)$
We remark that the nonzero elements of K are the units of the ring K[t].


We also remark that every nonzero polynomial is an associate of a unique monic polynomial.
Hence if d and d′ are monic polynomials for which d divides d′ and d′ divides d,
then $d = d'$. (A polynomial g divides a polynomial f if there is a polynomial h such that
$f = hg$.)


DIVISIBILITY 


The following theorem formalizes the process known as “long division’’. 


Theorem C.2 (Division Algorithm): Let f and g be polynomials over a field K with $g \ne 0$.
Then there exist polynomials q and r such that


$f = qg + r$
where either $r = 0$ or $\deg r < \deg g$.
Proof: If $f = 0$ or if $\deg f < \deg g$, then we have the required representation


$f = 0g + f$
Now suppose $\deg f \ge \deg g$, say


$f = a_n t^n + \cdots + a_1 t + a_0$ and $g = b_m t^m + \cdots + b_1 t + b_0$


where $a_n, b_m \ne 0$ and $n \ge m$. We form the polynomial


$f_1 = f - \dfrac{a_n}{b_m} t^{n-m} g$    (1)


Then $\deg f_1 < \deg f$. By induction, there exist polynomials $q_1$ and r such that
$f_1 = q_1 g + r$


where either $r = 0$ or $\deg r < \deg g$. Substituting this into (1) and solving for f,


$f = \left( q_1 + \dfrac{a_n}{b_m} t^{n-m} \right) g + r$
which is the desired representation.
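
The inductive step in the proof is exactly one round of long division, so it can be run as a loop. The sketch below works over the rational numbers using `fractions.Fraction` and assumes polynomials are stored as lists `[a_0, a_1, ..., a_n]` with the constant term first; the name `poly_divmod` is illustrative.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = [] or deg r < deg g.

    Polynomials are coefficient lists [a_0, a_1, ..., a_n] over Q.
    """
    if not g or g[-1] == 0:
        raise ValueError("g must be nonzero with a nonzero leading coefficient")
    r = [Fraction(a) for a in f]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    while r and r[-1] == 0:
        r.pop()
    while len(r) >= len(g):
        shift = len(r) - len(g)           # n - m
        coeff = r[-1] / Fraction(g[-1])   # a_n / b_m
        q[shift] += coeff
        for i, b in enumerate(g):         # r <- r - coeff * t^shift * g
            r[i + shift] -= coeff * b
        while r and r[-1] == 0:           # the leading term has cancelled
            r.pop()
    return q, r


# f = t^3 - 2t + 5, g = 2t - 1
q, r = poly_divmod([5, -2, 0, 1], [-1, 2])
print(q, r)
```

Each pass through the loop subtracts $(a_n / b_m) t^{n-m} g$ from the current remainder, which lowers its degree; the division by $b_m$ is where the field hypothesis is used.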


Theorem C.3: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is
an ideal in K[t], then there exists a unique monic polynomial d which generates
I, that is, such that d divides every polynomial $f \in I$.


Proof. Let d be a polynomial of lowest degree in I. Since we can multiply d by a nonzero
scalar and still remain in I, we can assume without loss in generality that d is a monic
polynomial. Now suppose $f \in I$. By Theorem C.2 there exist polynomials q and r such that


$f = qd + r$ where either $r = 0$ or $\deg r < \deg d$


Now $f, d \in I$ implies $qd \in I$ and hence $r = f - qd \in I$. But d is a polynomial of lowest
degree in I. Accordingly, $r = 0$ and $f = qd$, that is, d divides f. It remains to show that
d is unique. If d′ is another monic polynomial which generates I, then d divides d′ and d′
divides d. This implies that $d = d'$, because d and d′ are monic. Thus the theorem is
proved.


Theorem C.4: Let f and g be nonzero polynomials in K[t]. Then there exists a unique 
monic polynomial d such that: (i) d divides f and g; and (ii) if d’ divides 
f and g, then d’ divides d. 


Definition: The above polynomial d is called the greatest common divisor of f and g. If 
d=1, then f and g are said to be relatively prime. 


Proof of Theorem C.4. The set $I = \{mf + ng : m, n \in K[t]\}$ is an ideal. Let d be the
monic polynomial which generates I. Note $f, g \in I$; hence d divides f and g. Now suppose
d′ divides f and g. Let J be the ideal generated by d′. Then $f, g \in J$ and hence $I \subset J$.
Accordingly, $d \in J$ and so d′ divides d as claimed. It remains to show that d is unique.
If $d_1$ is another (monic) greatest common divisor of f and g, then d divides $d_1$ and $d_1$ divides
d. This implies that $d = d_1$ because d and $d_1$ are monic. Thus the theorem is proved.


Corollary C.5: Let d be the greatest common divisor of the polynomials f and g. Then
there exist polynomials m and n such that $d = mf + ng$. In particular, if
f and g are relatively prime then there exist polynomials m and n such
that $mf + ng = 1$.


The corollary follows directly from the fact that d generates the ideal
$I = \{mf + ng : m, n \in K[t]\}$
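
Corollary C.5 asserts that m and n exist; one standard way to compute them is the extended Euclidean algorithm, which is a different route from the ideal-theoretic proof in the text. The sketch below works over the rationals with polynomials stored as lists `[a_0, a_1, ..., a_n]`; all function names are illustrative, and at least one of f, g is assumed nonzero.

```python
from fractions import Fraction


def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(a) for a in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def add(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def mul(f, g):
    if not f or not g:
        return []
    c = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return trim(c)


def divmod_poly(f, g):
    """Return (q, r) with f = q*g + r and r = [] or deg r < deg g; g nonzero, trimmed."""
    q, r = [], trim(f)
    while len(r) >= len(g):
        term = [Fraction(0)] * (len(r) - len(g)) + [r[-1] / g[-1]]
        q = add(q, term)
        r = add(r, mul([-1], mul(term, g)))
    return q, r


def monic_gcd(f, g):
    """Return (d, m, n) with d monic and d = m*f + n*g."""
    r0, r1 = trim(f), trim(g)
    m0, m1 = [Fraction(1)], []
    n0, n1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = mul([-1], q)
        r0, r1 = r1, r
        m0, m1 = m1, add(m0, mul(neg_q, m1))
        n0, n1 = n1, add(n0, mul(neg_q, n1))
    scale = [1 / r0[-1]]          # normalize so that d is monic
    return mul(scale, r0), mul(scale, m0), mul(scale, n0)


f = [-1, 0, 1]      # t^2 - 1
g = [2, -3, 1]      # t^2 - 3t + 2
d, m, n = monic_gcd(f, g)
print(d)                                # t - 1, stored as [-1, 1]
print(add(mul(m, f), mul(n, g)) == d)   # True
```

The loop maintains the invariant that each remainder is a combination of f and g, so the last nonzero remainder, once scaled to be monic, is the generator d together with its coefficients m and n.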


FACTORIZATION 
A polynomial $p \in K[t]$ of positive degree is said to be irreducible if $p = fg$ implies
f or g is a scalar.


Lemma C.6: Suppose $p \in K[t]$ is irreducible. If p divides the product fg of polynomials
$f, g \in K[t]$, then p divides f or p divides g. More generally, if p divides the
product of polynomials $f_1 f_2 \cdots f_n$, then p divides one of them.


Proof. Suppose p divides fg but not f. Since p is irreducible, the polynomials f and
p must then be relatively prime. Thus there exist polynomials $m, n \in K[t]$ such that
$mf + np = 1$. Multiplying this equation by g, we obtain $mfg + npg = g$. But p divides fg
and so mfg, and p divides npg; hence p divides the sum $g = mfg + npg$.




Now suppose p divides $f_1 f_2 \cdots f_n$. If p divides $f_1$, then we are through. If not, then by
the above result p divides the product $f_2 \cdots f_n$. By induction on n, p divides one of the
polynomials $f_2, \ldots, f_n$. Thus the lemma is proved.


Theorem C.7 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t].
Then f can be written uniquely (except for order) as a product


$f = k p_1 p_2 \cdots p_n$
where $k \in K$ and the $p_i$ are monic irreducible polynomials in K[t].
Proof: We prove the existence of such a product first. If f is irreducible or if $f \in K$,
then such a product clearly exists. On the other hand, suppose $f = gh$ where g and h are
nonscalars. Then g and h have degrees less than that of f. By induction, we can assume


$g = k_1 g_1 g_2 \cdots g_r$ and $h = k_2 h_1 h_2 \cdots h_s$
where $k_1, k_2 \in K$ and the $g_i$ and $h_j$ are monic irreducible polynomials. Accordingly,


$f = (k_1 k_2) g_1 g_2 \cdots g_r h_1 h_2 \cdots h_s$
is our desired representation.


We next prove uniqueness (except for order) of such a product for f. Suppose
$f = k p_1 p_2 \cdots p_n = k' q_1 q_2 \cdots q_m$


where $k, k' \in K$ and the $p_1, \ldots, p_n, q_1, \ldots, q_m$ are monic irreducible polynomials. Now $p_1$
divides $k' q_1 \cdots q_m$. Since $p_1$ is irreducible it must divide one of the $q_i$ by the above lemma.
Say $p_1$ divides $q_1$. Since $p_1$ and $q_1$ are both irreducible and monic, $p_1 = q_1$. Accordingly,


$k p_2 \cdots p_n = k' q_2 \cdots q_m$


By induction, we have that $n = m$ and $p_2 = q_2, \ldots, p_n = q_m$ for some rearrangement of
the $q_i$. We also have that $k = k'$. Thus the theorem is proved.


If the field K is the complex field C, then we have the following result which is known 
as the fundamental theorem of algebra; its proof lies beyond the scope of this text. 


Theorem C.8 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial
over the complex field C. Then f(t) can be written uniquely (except for
order) as a product
$f(t) = k(t - r_1)(t - r_2) \cdots (t - r_n)$


where $k, r_i \in C$, i.e. as a product of linear polynomials.
In the case of the real field R we have the following result.


Theorem C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be
written uniquely (except for order) as a product


$f(t) = k p_1(t) p_2(t) \cdots p_m(t)$


where $k \in R$ and the $p_i(t)$ are monic irreducible polynomials of degree one
or two.


INDEX 


Abelian group, 320 Column 
Absolute value, 4 ofa fee 35 
Addition, rank, 90 
in R”, 2 space, 67 
of linear mappings, 128 vector, 36 
of matrices, 36 Companion matrix, 228 
Adjoint, Complex numbers, 4 
classical, 176 Components, 2 
operator, 284 Composition of mappings, 121 
Algebra, Congruent matrices, 262 
isomorphism, 169 Conjugate complex number, 4 
of linear operators, 129 Consistent linear equations, 31 
of square matrices, 43 Convex, 260 
Algebraic multiplicity, 203 Coordinate, 2 
Alternating, vector, 92 
bilinear forms, 262 Coset, 229 
multilinear forms, 178, 277 Cramer’s rule, 177 
Angle between vectors, 282 Cyclic group, 325 
Annihilator, 227, 251 Cyclic subspaces, 227 
Anti-symmetric 
bilinear form, 263 Decomposition, 
operator, 285 direct sum, 224 
Augmented matrix, 40 primary, 225 
Degenerate bilinear form, 262 
Basis, 88 Dependent vectors, 86 
change of, 153 Determinant, 171 
Bessel’s inequality, 309 Determinantal rank, 195 
Bijective mapping, 123 Diagonal 
Bilinear form, 261, 277 matrix, 43 
Binary relation, 318 of a matrix, 43 
Block matrix, 45 Diagonalization, 
Bounded function, 65 Euclidean spaces, 288 
unitary spaces, 290 
C, 4 vector spaces, 155, 199 
C", 5 Dimension, 88 
Cayley-Hamilton theorem, 201, 211 Direct sum, 69, 82, 224 
Canonical forms in Disjoint, 316 
Euclidean spaces, 288 Distance, 3, 280 


Distinguished elements, 41 


unitary spaces, 290 
Division algorithm, 328 


vector spaces, 222 


Cauchy-Schwarz inequality, 4, 10, 281 Domain, 
Cells, 45 integral, 322 
Change of basis, 153 of a mapping, 121 
Characteristic, Dot product, 
equation, 200 in C”, 6 
matrix, 200 in R”, 3 
polynomial, 200, 208, 210 Dual 
value, 198 basis, 250 
vector, 198 space, 249 
Classical adjoint, 176 
Co-domain, 121 Echelon form, 
Coefficient matrix, 40 linear equations, 21 
Cofactor, 174 matrices, 41 


331 


332 


Echelon matrix, 41 
Eigenspace, 198, 205 
Eigenvalue, 198 
Eigenvector, 198 
Element, 315 
Elementary, 

column operation, 61 

divisors, 229 

matrix, 56 

row operation, 41 
Elimination, 20 
Empty set, 315 
Equality 

of matrices, 36 

of vectors, 2 
Equations (see Linear equations) 
Equivalence relation, 318 
Equivalent matrices, 61 
Euclidean space, 3, 279 
Even 

function, 83 

permutation, 171 
External direct sum, 82 


Field, 323 

Free variable, 21 
Function, 121 
Functional, 249 


Gaussian integers, 326 

Generate, 66 

Geometric multiplicity, 203 
Gram-Schmidt orthogonalization, 283 
Greatest common divisor, 329 

Group, 320 


Hermitian, 
form, 266 
matrix, 266 
Hilbert space, 280 
Hom (V, U), 128 
Homogeneous linear equations, 19 
Homomorphism, 123 
Hyperplane, 14 


Ideal, 322 
Identity, 
element, 320 
mapping, 123 
matrix, 43 
permutation, 172 
Image, 121, 125 
Inclusion mapping, 146 
Independent 
subspaces, 244 
vectors, 86 
Index 
of nilpotency, 225 
set, 316 
Injective mapping, 123 
Inner product, 279 
Inner product space, 279 
Integers modulo n, 323 


INDEX 


Integral domain, 322 
Intersection of sets, 316 
Invariant subspace, 223 
Inverse, 

mapping, 123 

matrix, 44, 176 
Invertible, 

linear operator, 130 

matrix, 44 
Irreducible, 323, 329 
Isomorphism of 

algebras, 169 

groups, 321 

inner product spaces, 286, 311 

vector spaces, 93, 124 


Jordan canonical form, 226 
Kernel, 123, 321, 326 


l,-space, 280 
Line segment, 14, 260 
Linear combination 
of equations, 30 
of vectors, 66 
Linear dependence, 86 
in R^n, 28
Linear equations, 18, 127, 176, 251, 282 
Linear functional, 249 
Linear independence, 86 
in R^n, 28
Linear mapping, 123 
matrix of, 150 
rank of, 126 
Linear operators, 129 
Linear span, 66 


Mapping, 121 
linear, 123 
Matrices, 35 
addition, 36 
augmented, 40 
block, 45 
change of basis, 153 
coefficient, 40 
column, 35 
congruent, 262 
determinant, 171 
diagonal, 43 
echelon, 41 
equivalent, 61 
Hermitian, 266 
identity, 43 
multiplication, 39 
normal, 290 
rank, 90 
row, 35 
row canonical form, 42, 68 
row equivalent, 41 
row space, 60 
scalar, 43 
scalar multiplication, 36 
similar, 155 
size, 35 


square, 43 
symmetric, 65, 288 
transition, 153 
transpose, 39 
triangular, 43 
zero, 37 
Matrix representation, 
bilinear forms, 262 
linear mappings, 150 
Maximal independent set, 89 
Minimal polynomial, 202, 212 
Minkowski’s inequality, 10 
Minor, 174 
Module, 323 
Monic polynomial, 201 
Multilinear, 178, 277 
Multiplication of matrices, 37, 39 


N (positive integers), 315 
n-space, 2 
n-tuple, 2 
Nilpotent, 225 
Nonnegative semi-definite, 265 
Nonsingular, 

linear mapping, 127 

matrix, 130 
Norm, 279 

in R^n, 4
Normal operator, 286, 290, 303 
Normal subgroup, 320 
Normalized vector, 280 
Null set, 315 
Nullity, 126 


Odd, 

function, 73 

permutation, 171 
One-to-one mappings, 123 
Onto mappings, 123 
Operations with linear mappings, 128 
Operators (see Linear operators) 
Ordered pair, 318 
Orthogonal 

complement, 281 

matrix, 287 

operator, 286 

vectors, 3, 280 
Orthogonally equivalent, 288 
Orthonormal, 282 


Parallelogram law, 307 
Parity, 171 
Partition, 319 
Permutations, 171 
Polar form, 264, 307 
Polynomials, 327 
Positive 
matrix, 310 
operator, 288 
Positive definite, 
bilinear form, 265 
matrix, 272, 310 
operator, 288 




Primary decomposition theorem, 225 


Prime ideal, 326 
Principal ideal, 322 
Principal minor, 219 
Product set, 317 
Projection operator, 243, 308 
orthogonal, 281 
Proper 
subset, 316 
value, 198 
vector, 198 


Q (rational numbers), 315 
Quadratic form, 264 
Quotient, 

group, 320 

module, 326 

ring, 322 

set, 319 

space, 229 


R (real field), 315 
R^n, 2
Rank, 
bilinear form, 262 
linear mapping, 126 
matrix, 90, 195 
Rational canonical form, 228 
Relation, 318 
Relatively prime, 329 
Ring, 322 
Row, 
canonical form, 42 
equivalent matrices, 41 
of a matrix, 35 
operations, 41 
rank, 90 
reduced echelon form, 41 
reduction, 42
vector, 36 


Scalar, 2, 63
mapping, 219
matrix, 43
Scalar multiplication, 69
of linear mappings, 128 
of matrices, 36 
Second dual space, 251 
Self-adjoint operator, 285 
Set, 315 
Sgn, 171
Sign of a permutation, 171 
Signature, 265, 266 
Similar matrices, 155 
Singular mappings, 127 
Size of a matrix, 35 
Skew-adjoint operator, 285 
Skew-symmetric bilinear form, 263 
Solution, 
of linear equations, 18, 23
space, 65 
Span, 66 
Spectral theorem, 291 
Square matrices, 43 




Subgroup, 320 
Subring, 322 
Subset, 315 
Subspace (of a vector space), 65 
sum of, 68 
Surjective mapping, 123 
Sylvester’s theorem, 265 
Symmetric, 
bilinear form, 263 
matrix, 65 
operator, 285, 288, 300 
System of linear equations, 19 


Trace, 155 
Transition matrix, 153 
Transpose, 

of a linear mapping, 252 

of a matrix, 39 
Transposition, 172 
Triangle inequality, 293 
Triangular, 

form, 222 

matrix, 43 
Trivial solution, 19 


Union of sets, 316 




Unique factorization, 323 
Unit vector, 280 
Unitarily equivalent, 288 
Unitary, 

matrix, 287 

operator, 286 

space, 279 
Universal set, 316 
Upper triangular matrix, 43 
Usual basis, 88, 89 


Vector, 63 

in C^n, 5

in R^n, 2
Vector space, 63 
Venn diagram, 316 


Z (integers), 315 
Z_n (ring of integers modulo n), 323
Zero, 

mapping, 124 

matrix, 37 

of a polynomial, 44 

solution, 19 

vector, 3, 63 




SCHAUM’S OUTLINE 


SERIES 


COLLEGE PHYSICS 
including 625 SOLVED PROBLEMS 
Edited by CAREL W. van der MERWE, Ph.D., 


Professor of Physics, New York University 


COLLEGE CHEMISTRY 
including 385 SOLVED PROBLEMS 


Edited by JEROME L. ROSENBERG, Ph.D., 
Professor of Chemistry, University of Pittsburgh 


GENETICS 
including 500 SOLVED PROBLEMS 


By WILLIAM D. STANSFIELD, Ph.D., 
Dept. of Biological Sciences, Calif. State Polytech.


MATHEMATICAL HANDBOOK 


including 2400 FORMULAS and 60 TABLES 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst. 


First Yr. COLLEGE MATHEMATICS 


including 1850 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


COLLEGE ALGEBRA 


including 1940 SOLVED PROBLEMS 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst.


TRIGONOMETRY 


including 680 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


MATHEMATICS OF FINANCE 


including 500 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


PROBABILITY 


including 500 SOLVED PROBLEMS 


By SEYMOUR LIPSCHUTZ, Ph.D., 
Assoc. Prof. of Math., Temple University 


STATISTICS 


including 875 SOLVED PROBLEMS 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst. 


ANALYTIC GEOMETRY 


including 345 SOLVED PROBLEMS 
By JOSEPH H. KINDLE, Ph.D., 


Professor of Mathematics, University of Cincinnati 


DIFFERENTIAL GEOMETRY 


including 500 SOLVED PROBLEMS 
By MARTIN LIPSCHUTZ, Ph.D., 


Professor of Mathematics, University of Bridgeport 


CALCULUS 


including 1175 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


DIFFERENTIAL EQUATIONS 


including 560 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


SET THEORY and Related Topics 


including 530 SOLVED PROBLEMS 


By SEYMOUR LIPSCHUTZ, Ph.D., 
Assoc. Prof. of Math., Temple University 


FINITE MATHEMATICS 
including 750 SOLVED PROBLEMS 


By SEYMOUR LIPSCHUTZ, Ph.D., 
Assoc. Prof. of Math., Temple University 


MODERN ALGEBRA 
including 425 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


LINEAR ALGEBRA 
including 600 SOLVED PROBLEMS 


By SEYMOUR LIPSCHUTZ, Ph.D., 
Assoc. Prof. of Math., Temple University 


MATRICES 
including 340 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


PROJECTIVE GEOMETRY 
including 200 SOLVED PROBLEMS 
By FRANK AYRES, Jr., Ph.D., 


Professor of Mathematics, Dickinson College 


GENERAL TOPOLOGY 
including 650 SOLVED PROBLEMS 


By SEYMOUR LIPSCHUTZ, Ph.D., 
Assoc. Prof. of Math., Temple University 


GROUP THEORY 
including 600 SOLVED PROBLEMS 


By B. BAUMSLAG, B. CHANDLER, Ph.D., 
Mathematics Dept., New York University 


VECTOR ANALYSIS 
including 480 SOLVED PROBLEMS 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst. 


ADVANCED CALCULUS 


including 925 SOLVED PROBLEMS 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst. 


COMPLEX VARIABLES 


including 640 SOLVED PROBLEMS 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst. 


LAPLACE TRANSFORMS 


including 450 SOLVED PROBLEMS 
By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst. 


NUMERICAL ANALYSIS 


including 775 SOLVED PROBLEMS 
By FRANCIS SCHEID, Ph.D., 


Professor of Mathematics, Boston University 


DESCRIPTIVE GEOMETRY 


including 175 SOLVED PROBLEMS 


By MINOR C. HAWK, Head of 
Engineering Graphics Dept., Carnegie Inst. of Tech. 


ENGINEERING MECHANICS 
including 460 SOLVED PROBLEMS 
By W. G. McLEAN, B.S. in E.E., M.S.,


Professor of Mechanics, Lafayette College


and E. W. NELSON, B.S. in M.E., M. Adm. E.,


Engineering Supervisor, Western Electric Co. 




THEORETICAL MECHANICS


including 720 SOLVED PROBLEMS


By MURRAY R. SPIEGEL, Ph.D., 


Professor of Math., Rensselaer Polytech. Inst.


LAGRANGIAN DYNAMICS 
including 275 SOLVED PROBLEMS 
By D. A. WELLS, Ph.D., 


Professor of Physics, University of Cincinnati 


STRENGTH OF MATERIALS 
including 430 SOLVED PROBLEMS 
By WILLIAM A. NASH, Ph.D., 


Professor of Eng. Mechanics, University of Florida 


FLUID MECHANICS and HYDRAULICS 
including 475 SOLVED PROBLEMS 
By RANALD V. GILES, B.S., M.S. in C.E., 


Prof. of Civil Engineering, Drexel Inst. of Tech. 


FLUID DYNAMICS 
including 100 SOLVED PROBLEMS 
By WILLIAM F. HUGHES, Ph.D., 


Professor of Mech. Eng., Carnegie Inst. of Tech. 


and JOHN A. BRIGHTON, Ph.D.,
Asst. Prof. of Mech. Eng., Pennsylvania State U. 


ELECTRIC CIRCUITS 
including 350 SOLVED PROBLEMS 


By JOSEPH A. EDMINISTER, M.S.E.E., 
Assoc. Prof. of Elec. Eng., University of Akron 


ELECTRONIC CIRCUITS 
including 160 SOLVED PROBLEMS 
By EDWIN C. LOWENBERG, Ph.D., 


Professor of Elec. Eng., University of Nebraska 


FEEDBACK & CONTROL SYSTEMS 
including 680 SOLVED PROBLEMS 
By J. J. DISTEFANO III, A. R. STUBBERUD, 


and I. J. WILLIAMS, Ph.D., 
Engineering Dept., University of Calif., at L.A. 


TRANSMISSION LINES 
including 165 SOLVED PROBLEMS 
By R. A. CHIPMAN, Ph.D.,


Professor of Electrical Eng., University of Toledo 


REINFORCED CONCRETE DESIGN 


including 200 SOLVED PROBLEMS 


By N. J. EVERARD, MSCE, Ph.D., 
Prof. of Eng. Mech. & Struc., Arlington State Col. 


and J. L. TANNER III, MSCE,


Technical Consultant, Texas Industries Inc. 


MECHANICAL VIBRATIONS 


including 225 SOLVED PROBLEMS 


By WILLIAM W. SETO, B.S. in M.E., M.S., 
Assoc. Prof. of Mech. Eng., San Jose State College 


MACHINE DESIGN 


including 320 SOLVED PROBLEMS 
By HALL, HOLOWENKO, LAUGHLIN 


Professors of Mechanical Eng., Purdue University 


BASIC ENGINEERING EQUATIONS 


including 1400 BASIC EQUATIONS 
By W. F. HUGHES, E. W. GAYLORD, Ph.D., 


Professors of Mech. Eng., Carnegie Inst. of Tech.


ELEMENTARY ALGEBRA 


including 2700 SOLVED PROBLEMS 
By BARNETT RICH, Ph.D., 
Head of Math. Dept., Brooklyn Tech. H.S. 


PLANE GEOMETRY
including 850 SOLVED PROBLEMS
By BARNETT RICH, Ph.D.,
Head of Math. Dept., Brooklyn Tech. H.S.


TEST ITEMS IN EDUCATION


including 3100 TEST ITEMS
By G. J. MOULY, Ph.D., L. E. WALTON, Ph.D.,


Professors of Education, University of Miami




